Abstract

This is a companion to another paper. Together they rebut two widespread philo- 
sophical doctrines about emergence. The first, and main, doctrine is that emergence 
is incompatible with reduction. The second is that emergence is supervenience; or 
more exactly, supervenience without reduction. 

In the other paper, I develop these rebuttals in general terms, emphasising the 
second rebuttal. Here I discuss the situation in physics, emphasising the first rebut- 
tal. I focus on limiting relations between theories and illustrate my claims with four 
examples, each of them a model or a framework for modelling, from well-established 
mathematics or physics. 

I take emergence as behaviour that is novel and robust relative to some com- 
parison class. I take reduction as, essentially, deduction. The main idea of my first 
rebuttal will be to perform the deduction after taking a limit of some parameter. 
Thus my first main claim will be that in my four examples (and many others), we 
can deduce a novel and robust behaviour, by taking the limit $N \to \infty$ of a parameter N.

But on the other hand, this does not show that the $N = \infty$ limit is "physically
real", as some authors have alleged. For my second main claim is that in these
same examples, there is a weaker, yet still vivid, novel and robust behaviour that 
occurs before we get to the limit, i.e. for finite N. And it is this weaker behaviour 
which is physically real. 

My examples are: the method of arbitrary functions (in probability theory); 
fractals (in geometry); superselection for infinite systems (in quantum theory); and 
phase transitions for infinite systems (in statistical mechanics). 
1 Introduction



1.1 A limited peace 

'More is different!', proclaimed Philip Anderson in a famous paper (1972) advocating the 
autonomy of what are often called 'special' or 'higher-level' sciences or theories. A catchy 
slogan, indeed. But his reductionist opponents, such as Weinberg (1987), could have 
matched it, by invoking Mies van der Rohe's pithy defence of functionalist architecture: 
'Less is more'. Hence my title. For my main point will be that although emergence is 
usually opposed to reduction, many examples exhibit both. So my title, 'Less is different', 
is meant as an irenic combination of the two parties' slogans. I will spell out this reconcil- 
iation in two claims, illustrated by four examples. The two claims, mnemonically labelled 
(1:Deduce) and (2:Before), are defined in Section 1.2, and each example is a model or a
framework for modelling, from well-established mathematics or physics. 

My irenic title is also ironic. For it deliberately echoes the sceptical refrain that there 
is nothing new under the Sun. Though I will not name names, most would agree that 
there is a good deal of heat, and rather less light, in the debate about emergence vs. 
reduction. Here's hoping that you will not recite that same refrain after reading this 
paper! Of course the heat and dark is in part due to different authors giving 'emergence' 
and 'reduction' different meanings. Thus I do not claim to be the only author to celebrate 
these words' compatibility. Among other celebrants, albeit using different meanings, are 
Simon (1996, pp. 249-251) and Wimsatt (1997, pp. 99-100 and references therein).^1

However, this is a companion to another paper (Butterfield 2010). So although the 
papers can be read independently, I should begin by describing their common aims and 
how they share out the work between them. In brief, both papers construe the contested 
terms, 'emergence' and 'reduction', as follows; (the other paper gives more details, and a 
defence of these construals; cf. its Sections 1.1, 2.1 and 3.1.1.). 

I take emergence as behaviour that is novel and robust relative to some comparison 
class. In particular, my examples will be typical of many, by using two widespread 
conceptions of what the comparison class is, as follows. (1): Composites: The system is a 
composite; and its properties and behaviour are novel and robust compared to those of its 
component systems, especially its microscopic or even atomic components. (2): Limits: 
The system is a limit of a sequence of systems, typically as some parameter (in the theory 
of the systems) goes to infinity (or some other crucial value, often zero); and its properties 
and behaviour are novel and robust compared to those of systems described with a finite 
(respectively: non-zero) parameter. (Section 3 will explain how these ideas, (1) and (2),
are better put in terms of quantities and their values, rather than systems.) 

I take reduction as, essentially, deduction; though usually aided by appropriate def- 
initions or bridge-principles linking the two theories' vocabularies. This will be close 

^1 Other playful variations on Anderson's slogan occur in Kadanoff's splendid historico-philosophical
introductions to phase transitions (2009, 2010, 2010a), which I will advert to in Section 7. Cat (1998)
is a scholarly review of the Anderson-Weinberg debate; Bouatta and Butterfield (2011) also contains a
discussion.
to endorsing the traditional account of Nagel (1961), despite various objections levelled
against it. The picture is that the claims of some worse or less detailed (often earlier) the- 
ory can be deduced within a better or more detailed (often later) theory, once we adjoin 
to the latter appropriate definitions of the proprietary terms of the former. I also adopt 
a mnemonic notation, writing $T_b$ for the better, bottom or basic theory, and $T_t$ for the
tainted, top or tangible theory; (where 'tangible' connotes restriction to the observable,
i.e. less detail). So the picture is, with $D$ standing for the definitions: $T_b \wedge D \Rightarrow T_t$. In
logicians' jargon: $T_t$ is a definitional extension of $T_b$.
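
As a hedged formal gloss on this schema: suppose the proprietary terms of the tainted theory $T_t$ are predicates $P_1, \dots, P_n$, and write $\mathcal{L}_b$ for the vocabulary of the basic theory $T_b$. (The symbols $P_i$, $\varphi_i$ and $\mathcal{L}_b$ are labels introduced here for illustration, not notation from the text.) Then one standard way to spell out the definitions $D$ and the deduction is, roughly:

```latex
% Assumed notation (illustrative only): T_b = basic theory, T_t = tainted theory,
% \mathcal{L}_b = vocabulary of T_b, P_1, ..., P_n = proprietary predicates of T_t.
\[
  D \;=\; \bigl\{\, \forall \bar{x}\, \bigl( P_i(\bar{x}) \leftrightarrow \varphi_i(\bar{x}) \bigr) \;:\; i = 1, \dots, n \,\bigr\},
  \qquad \text{each } \varphi_i \text{ a (finite) formula of } \mathcal{L}_b,
\]
\[
  T_b \cup D \;\vdash\; \psi \qquad \text{for every theorem } \psi \text{ of } T_t .
\]
```

On this reading, the deductive work is done within the basic theory; the definitions only supply the vocabulary in which the tainted theory's claims are stated.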

In both papers, especially the other one, I consider a notion much discussed in the 
philosophy (but not physics) literature: supervenience (also known as 'determination' or 
'implicit definability'). This is a less contested term. It is taken by all to be a relation
between families of properties: the extensions of all the properties in one family relative 
to a given domain of objects determine the extension of each property in the other family. 
Besides, under wide conditions, this is a weakening of the usual notion of the second family 
being definable from the first, which is called 'explicit definability'. Roughly speaking, 
this weakening allows a definition of a property P in the second family, in terms of the 
first family, to be infinitely long, rather than finite. 
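
The contrast can be displayed schematically. This is only a rough sketch, valid under the 'wide conditions' just mentioned; the formulas $\varphi$, $\varphi_i$ and the index set $I$ are illustrative labels, not the author's notation.

```latex
% Illustrative schema: P is a property in the second family;
% \varphi, \varphi_i are formulas in the vocabulary of the first family.
\[
  \text{explicit definability:} \qquad
  \forall x \, \bigl( P(x) \leftrightarrow \varphi(x) \bigr),
  \quad \varphi \text{ a finite formula;}
\]
\[
  \text{supervenience (roughly):} \qquad
  \forall x \, \Bigl( P(x) \leftrightarrow \bigvee_{i \in I} \varphi_i(x) \Bigr),
  \quad I \text{ possibly infinite.}
\]
```

Each disjunct $\varphi_i$ can be read as one "way to be P" in the taxonomy of the first family.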

Since the definitions used in a Nagelian reduction are finite, supervenience is widely 
taken to be a weakening of Nagelian reduction. Besides, various philosophers have con- 
sidered the infinity of "ways to be P" given by an infinitely long definition to be a good 
way of making precise the heterogeneity or multiplicity of realization that philosophers 
have often associated with emergence. Thus arose the doctrine that emergence is "mere 
supervenience", i.e. supervenience without all the definitions being finite, as in a Nagelian
reduction. 

With these construals of the terms, the papers aim to rebut two widespread doctrines 
about emergence: the doctrine just mentioned, that emergence is mere supervenience, 
found in the philosophy literature; and the more widespread doctrine, found also in the 
physics literature (including the Anderson-Weinberg debate), that emergence is
incompatible with reduction.

In the other paper, I develop these rebuttals in general terms; including a discussion 
of some other possible construals of the contested terms. I also emphasise supervenience, 
and thereby the first rebuttal. Thus I give (i) examples of mere supervenience which 
are not emergence and (ii) examples of emergence which are not mere supervenience nor 
reduction. 

But in this paper, I will discuss the situation in physics and down-play supervenience, 
thus emphasising the second rebuttal. That is, I will argue that emergence is compatible 
with reduction, since physics gives examples combining both. The main idea will be to 
perform the reduction, i.e. deduction, after taking a limit of some parameter. Thus my 
first main claim, (1:Deduce), will be that in my four examples (and many others), we can
deduce a novel and robust behaviour, by taking the limit $N \to \infty$ of a parameter N.

But on the other hand, this does not show that the $N = \infty$ limit is "physically
real", as some authors have alleged. For my second main claim, (2:Before), is that in
these same examples, there is a weaker, yet still vivid, novel and robust behaviour that
occurs before we get to the limit, i.e. for finite N. And it is this weaker behaviour which
is physically real.

This contrast between strong and weak senses of emergence, and respectively its absence
or presence at finite N, will be the main common theme across my four examples. It
will also illuminate another current topic within philosophy of physics, about the signifi- 
cance of 'singular' limits in a physical theory. In fact, some authors propose to characterize 
emergence in terms of 'singular' limits.^2 I deny this proposal. Although my two claims,
and my four examples combining emergence and reduction, involve taking a limit, the 
limit is singular in only two of the four examples (the second and fourth, viz. fractals and 
phase transitions). So emergence is not always a matter of a singular limit, just as it is
not always a matter of mere supervenience. 

This negative verdict leaves open many questions, in particular: is emergence always a 
matter of a limit, whether singular or not? And even though a singular limit is not neces- 
sary for emergence, is it sufficient? In fact I think the answers to these questions are again 
'No'. But I will not attempt to give a detailed characterization of emergence, whereby to 
prove these last two 'No's.^3 The literature contains several such characterizations, with
various merits. But as I explain in the other paper (especially Sections 1.1, 2.1), I doubt 
that there is — and that there needs to be — a single best meaning of 'emergence'; and sim- 
ilarly for 'reduction'. Anyway, I can develop my claims and examples while adopting my 
construals — of 'emergence' as novel and robust behaviour, and of reduction as deduction 
à la Nagel.

Before I give a prospectus (Section 1.2), I should make two final comments about
these construals, and about my choice of examples. First: I submit that my construals of 
'emergence' and 'reduction' are strong enough to make it worth exhibiting examples that 
combine them. Also, they seem to be in tension with each other: since logic teaches us 
that valid deduction gives no new "content", how can one ever deduce novel behaviour?
This tension is also shown by the fact that many authors who take emergence to involve 
novel behaviour thereby take it to also involve irreducibility. The answer to the 'how?' 
question, i.e. my reconciliation, will lie in using limits: one performs the deduction after 
taking a limit of some parameter. So one main moral will be that in such a limit there 
can be novelty, compared with what obtains away from the limit. 

Second: there is the issue of how I choose my examples. Here you may suspect 
what might be called the 'case-study gambit': trying to support a general conclusion by 
describing examples that have the required features, though in fact the examples are not 

^2 I have used scare-quotes since writers often use the term loosely: too loosely, as I explain, and
complain, in Section 3. But for easier reading, I will henceforth drop the scare-quotes.

^3 Incidentally, for the last 'No', that a singular limit is not sufficient for emergence: I agree with
Wayne's argument for this (against Rueger (2000, p. 308; 2006, pp. 344-345)). Wayne uses Rueger's
own example, of the van der Pol oscillator (Wayne 2009, Sections 3-5). My second example, fractals,
will give another counterexample (Section 5.2.1): the topological dimension of a sequence of sets $C_N$
is discontinuous in the limit, i.e. $\lim_{N \to \infty} \dim C_N \neq \dim \lim_{N \to \infty} C_N$: but there is no emergence. For
persuasive, more general, critiques of associating emergence or irreducibility with "singular asymptotics",
cf. Belot (2005, especially Section 5) and Hooker (2004, pp. 446-458).
typical, so that the attempt fails, i.e. the general conclusion, that all or most examples
have the features, does not follow. But to this charge also I plead innocent, for the simple 
reason that I will not urge so general a conclusion, in the way that a reductionist opponent 
of Anderson might. (For example, I think Weinberg's objective reductionism (1987, pp.
349-353) implies that (with my meanings of the terms) all known examples of emergence
are also examples of reduction.) On the other hand, I do aspire to some generality! It 
will be clear that my claims, in particular my two main ones, (1:Deduce) and (2:Before),
are illustrated by many examples beyond the four I have chosen. So I submit that the 
claims reflect the amazing power of Nagelian reduction. 

1.2 Prospectus 

Thus my main aim is to reconcile emergence with reduction, by arguing for two main 
claims, illustrated by four examples. Each example is a model, or a framework for modelling,
from well-established mathematics or physics; and each involves an integer parameter
N = 1, 2, ... and its limit $N \to \infty$. In three of the examples, N is, roughly speaking,
the number of physical degrees of freedom of the system; in the second example, it is the 
number of iterations of a definitional process. 

In all the examples, N is, physically speaking, finite. But we can consider the limit:
both what happens on the way to the limit, and what happens at it. (In Section 3, I
will be more precise about the meaning of 'what happens', in terms of quantities being 
well-defined and what their values are.) Doing so yields my two main claims. The first is: 

(1:Deduce): Emergence is compatible with reduction. And this is so, with a
strong understanding both of 'emergence' (i.e. 'novel and robust behaviour') 
and of 'reduction' (viz. logicians' notion of definitional extension). In short: 
in the examples, considering $N \to \infty$ enables us to deduce novel and robust
behaviour, in strong senses of 'novel' and 'robust'. 

Besides, one needs to consider the limit in that: for each example, choosing a 
weaker theory using finite N blocks the deduction of this strong sense. And (as
discussed in Section 3), this weaker theory is appropriate and salient, i.e. liable
to come to mind. Since the theories $T_t$ and $T_b$ are often defined only vaguely
(by labels like 'thermodynamics' and 'statistical mechanics'), this swings-and-roundabouts
situation explains away some of the controversy over whether $T_t$
is reducible to $T_b$.

The second claim is: 

(2:Before): But on the other hand: emergence, in a weaker yet still vivid 
sense, occurs before we get to the limit. That is: in each example, one can 
understand 'novel and robust behaviour' weakly enough that it does occur for 
finite N.

Of my four examples, I have chosen the first three to be comparatively small, simple 
and agreed-upon, so that the philosophical issues stand out more clearly. They are from 
probability theory, geometry and quantum theory, respectively. The fourth example is an
enormous topic in physics, with much less agreement. The examples are, in order: 

1: The method of arbitrary functions, in probability theory (Section 4);

2: Fractals, in geometry (Section 5);

3: Superselection for infinite systems, in quantum theory (Section 6);

4: Phase transitions for infinite systems, in classical statistical mechanics (Section 7).

Apart from the contrast between strong and weak senses of emergence shown by 
(1:Deduce) and (2:Before), there will be two other philosophical themes in common across
the four examples. 

The first is that supervenience is a "red herring", i.e. irrelevant. (So this supports the 
other paper's rebuttal of the doctrine that emergence is mere supervenience.) For clarity, 
it will again be best to give this a mnemonic label, as follows: 

(3:Herring): Although various supervenience theses are true in the examples 
(and many others), the theses yield little or no insight — either into emergence, 
or more generally, into "what is going on" in the example. 

We can already state the basic reason for this irrelevance. Supervenience allows that for 
each property P in the "higher" i.e. supervening family of properties, there is, in the 
taxonomy given by the lower family, a disjunction of "ways to be P". But supervenience 
gives no "control" on this disjunction: not just in the sense that the disjunction might be 
infinite, but also that supervenience allows it to be utterly heterogeneous. In particular, 
no kind of limit is taken; and more generally, no connection is made between the variety, 
or infinity, of the disjunction and the limit processes, especially $N \to \infty$, which are crucial
to the example. Thus supervenience is, at least in these examples (and, I submit, many 
others), too weak a concept to be enlightening. 

The other theme in common across the four examples is that each example becomes, 
for finite but very large N, unrealistic in a vivid (one might even say: catastrophic) way.
But this occurs for reasons external to current debates about emergence, reduction and 
the significance of limits of physical theories. It also will not undermine my (2:Before). 
This is because each of my examples illustrates (2:Before) for values of its parameter N
much smaller than those at which the example becomes unrealistic in the catastrophic 
way. So I will not emphasize this theme. On the other hand, the theme seems to have 
been completely neglected in these debates' literature; so it is worth spelling out. I will 
do this in Section 2, again giving it a mnemonic label, (4:Unreal).

After I discuss (4:Unreal), I give in Section 3 a general discussion of physical systems
and their states and quantities, emphasizing the topic of limits: i.e. limits of systems,
states and quantities, as some parameter N (typically the number of degrees of freedom)
goes to infinity. There are two related philosophical questions to be addressed. The first, 
mentioned in Section 1.1, is whether emergence can be characterized in terms of limits,
especially singular limits. Contra some authors, I deny this; (along with others such as 
Wayne, Belot and Hooker). 

The second question is whether in some examples, the singular limit is not just
indispensable for deducing emergence in some strong sense, or for epistemic concerns such
as explanation and understanding, but also 'physically real'. These two questions are
related in various ways: most obviously, by a Yes to the second implying that emergence 
according to the first would be physically real. 

I shall also deny this: again, contra some authors. In more detail: my first claim, 
(1:Deduce), will illustrate how limits can be indispensable, viz. to deducing some novel
and robust behaviour, where the behaviour in question is taken in a strong sense. And
on the other hand, my second claim, (2:Before), will bring out how the $N = \infty$ limit is
not physically real. That is: only the weaker sense of emergence that occurs at finite N 
is physically real. 

So let me sum up my claims. Emergence is not in all cases failure of reduction, even in 
the strong sense of reduction given by deduction (cf. (1:Deduce)). (Here, the deduction's
need to invoke auxiliary definitions of the reduced theory's terms is made precise by 
logicians' notion of definitional extension; for details, cf. Section 3.1 of the companion 
paper). Nor does emergence in all cases occur only in the limit of the relevant parameter 
(cf. (2:Before)). Nor is emergence in all cases a matter of this limit being "singular" in
some sense: my first and third examples will have non-singular limits (cf. also Section 3).
Nor is emergence in all cases supervenience; nor is it in all cases failure of supervenience; 
(cf. Section 5 of the companion paper, and for the latter denial, (1:Deduce)). In short:
we have before us a varied landscape — emergence is independent of these other notions. 

2 Becoming unrealistic on the way to the limit 

As we will see, my examples (and many other models, such as continuous models of fluids 
and solids) are examples of: formulating a formalism by taking an admittedly unrealistic 
limit of a parameter's value. But they are also examples of: formulating a formalism by 
taking a limit of a description which is admitted to be unrealistic on the way to the limit. 
This is my fourth labelled claim: 

(4:Unreal): Each of the four examples becomes unrealistic before one gets to 
the $N = \infty$ limit, regardless of any technical issues about that limit, and
regardless of any philosophical controversies about emergence. 

One reason I need to discuss this claim is to show how it is consistent with (2:Before): 
the main point will be (as I mentioned) that (2:Before) applies to much smaller values 
of N. But phase transitions will also yield a remarkable illustration of "oscillations"
between (2:Before) and (4:Unreal). In Section 7.2.2, we will see how a system can be
manipulated so as to first illustrate (2:Before), i.e. an emergent behaviour at finite N,
then lose this behaviour, i.e. illustrate (4:Unreal), and then enter a regime illustrating
some other emergent behaviour (or revert to the first behaviour): a phenomenon called
'cross-over'.

There are also two other reasons why it is worth stating this claim, i.e. reasons 
unrelated to my own position in debates about emergence. First, almost all discussions 
of emergence, or more generally of limiting relations between theories, in the physics and 
philosophy literatures, fail to notice this point. Agreed, some maestros notice it (though
I do not mean to argue from authority!). Thus Feynman: 'When you follow any of our
physics too far, you find that you always get into some kind of trouble' (1964, Lecture
28.1).^4

Second, there is a common kind of reason for the un-realisticness ("break-down") of 
the examples. Besides, this kind of reason inevitably besets many other examples of 
taking limits of models as a parameter N, encoding physical degrees of freedom or some 
analogous concept, goes to $\infty$. So this commonality is worth registering, especially in
discussions of emergence, or more generally of limits of models as a parameter $N \to \infty$.

In short, the commonality is: as N becomes very large, the example runs up against
either the micro-structure of space and its contents (for short: atomism), or the macro-structure
of space and its contents (for short: cosmology). Thus my first two examples will
run up against atomism: that is, very large N will correspond to atomic or sub-atomic
lengths, making what the example says utterly unrealistic. And my third and fourth
examples will run up against cosmology: very large N will correspond to cosmic lengths
(and so gravity, and indeed spacetime curvature), again making what the example says
utterly unrealistic. I stress that these break-downs are not internal to the model, but
in relation to the actual world. To take my third and fourth examples: if there were no
gravity nor spacetime curvature, and if space had the structure of $\mathbb{R}^3$, these examples,
which postulate a chain of N spins or a gas of N molecules, in $\mathbb{R}^3$ without gravity, would
indeed remain realistic as N grows without bound.

I say 'in short', because in some examples Feynman's 'some kind of trouble' is not 
just either atomism or cosmology. The situation can be more varied. I will not enter 
into details, let alone try to classify the kinds of trouble. But to illustrate: my first 
example, the method of arbitrary functions, will include a model of a roulette wheel 
whose angular velocity tends to infinity; so the trouble will be, not atomism, but the 
fact that the model is Newtonian not relativistic! And more importantly: in my fourth 
example, phase transitions, some models run up against both atomism and cosmology. 
For in some models, the thermodynamic limit is not just the idea that keeping the density 
constant, the number N of molecules (and so the volume) tends to infinity: there are also
conditions on the limiting behaviour of short-range forces. 

However, (4:Unreal) plays a different role in my discussion from my other three labelled
claims: so I will not emphasize it as much as the others. There are two differences. First: 

^4 I should mention another meaning of 'intermediate between small and infinite values of a parameter'
that is noticed by the physics literature, under the label 'intermediate asymptotics': namely, a system's 
behaviour 'for times, and distances from boundaries, large enough for the influence of the fine details of 
the initial and/or boundary conditions to disappear, but small enough that the system is far from the 
ultimate equilibrium state' (Barenblatt 1996, p. xiii; cf. also p. 19). This meaning is obviously very 
different from this Section's 'intermediate N' regime. But it is worth mentioning, not just because of 
its intrinsic importance, but also because: (i) it is related to renormalization, which I will touch on in 
Section 7.2.2 (cf. also Goldenfeld et al. (1989)); and (ii) some philosophers (to their credit) have discussed
it — though surely Batterman goes much too far when he writes 'I think, as should be obvious by now, 
that any investigation that remotely addresses a question related to understanding universal behavior 
[i.e. in philosophers' terms: multiple realizability] will involve intermediate asymptotics as understood 
by Barenblatt' (2002, p. 46). 
with one exception, discussions of these examples and many others (including discussions
about emergence, and the examples' $N = \infty$ limit) do not, so far as I know, mention this
un-realisticness for very large N. (The exception is my second example, viz. fractals.)
Second: in each of my four examples, this un-realisticness for very large N is not relevant
to the ways that: 

(1): a strong sense of emergent (i.e. novel and robust) behaviour can be deduced
at the limit (cf. (1:Deduce)); and

(2): a weaker, yet still vivid, sense of emergent behaviour occurs on the way to the
limit (cf. (2:Before)); and

(3): supervenience is a red herring, giving little or no insight into the example (cf.
(3:Herring)).

So my discussion of emergence, in particular my main positive aim, viz. the reconciliation
got by combining my first two claims (1:Deduce) and (2:Before), can proceed without
discussing (4:Unreal). I stress again that each of the four examples illustrates (2:Before) 
for values of its parameter N much smaller than those at which the example becomes 
unrealistic. (And this point applies in many other examples of taking limits of models as 
a degrees-of-freedom parameter goes to $\infty$.) So to keep the discussion of my examples
as simple as possible, I will not explicitly refer there to (4:Unreal) — except at (i) the end 
of the second example, fractals, for which, as I said, the literature has noticed the point; 
and at (ii) the end of the fourth example, where the phenomenon of cross-over subtly 
combines (4:Unreal) with (2:Before). 

3 Systems, states, quantities, values — and their limits

In Section 1.2, I promised that my two claims, (1:Deduce) and (2:Before), would clarify (I
dare not say resolve!) the question whether in some examples of 'infinite' and-or 'singular'
limits, the limit is not just epistemically indispensable but also 'physically real'. More 
specifically, I said I would agree about the indispensability, thanks to (1:Deduce), but deny
the reality, thanks to (2:Before). But even before I show those claims in my examples, I 
can defend my general position; and in particular, justify my denying the physical reality 
of the limits. That is the job of this Section. 

This is a job worth doing for two reasons. First, some discussions of emergence, and 
more generally, of limiting relations between theories are sloppy in their use of mathemat- 
ical jargon about limits being 'singular' vs. 'regular/well-behaved/continuous' etc. And 
as I mentioned in Section 1.1, some authors even identify emergence with what happens at
a 'singular' limit (Batterman (2002, pp. 6, 120, 127, 135), (2006, pp. 902-903), (2009, pp. 
23-24); Rueger (2000, p. 308), (2006, pp. 344-345)). At least for my sense of emergence 
as novel and robust behaviour, this is wrong. In two of my four examples, there is nothing 
'singular' about the limit. And recall that footnote 3 cited other arguments (and other
authors) to the effect that a singular limit is not sufficient for emergence or irreducibility. 

Second, some of the literature's physical examples and philosophical discussions are 
dauntingly complex. To take just one current philosopher: Batterman's examples include:
(a): ray optics as a limit of wave optics; (b): classical mechanics as a limit of quantum
mechanics; (c): hydrodynamics as a limit of molecular models; (d): phase transitions as 
described in the thermodynamic limit of statistical mechanics. Each of these is a large 
and complex area of physics, in which recent decades have seen a lot of deep and beautiful 
work — some of whose creators have themselves given masterly philosophical discussions 
(e.g. Berry 1994, Goldenfeld et al. 1989, Kadanoff 2009, 2010, 2010a). So there is a great 
deal for philosophers to address; (and all credit to Batterman and others for doing so). 
But we run the risk of being blinded by science, i.e. being misled by arcane technicalia. 
So I propose to discuss just one area, and even that only briefly: phase transitions, which 
will be my fourth example. (As I mentioned in Section 1.2, I chose my first three examples
partly for their merit of being comparatively small and simple, so that the philosophical 
issues are clearer.) There is also a mountain of previous philosophical discussion, far too 
large to be addressed here. For apart from current authors like Batterman and Rueger, 
limiting relations between physical theories (singular or not) have long been a topic for 
authors such as Post, Schaffner, Scheibe, Rohrlich and Redhead. So I propose here just 
to spell out the general situation, as I see it. That will be enough to indicate how (at 
least in my examples!) there is no reason to believe the limit is physically real — a verdict 
which my examples will then confirm. 

I divide the task into three subsections, 3.1 to 3.3. Sections 3.1 and 3.2 lay out some
distinctions. Then Section 3.3 addresses the philosophical issue of what justifies our using
a description with $N = \infty$. There I argue that even when the relevant limits are singular, a
straightforward and broadly instrumentalist justification, viz. mathematical convenience 
and empirical correctness, applies: so that we need not believe the limit is physically real. 

3.1 Emergence with and without infinite systems — and with ordinary limits

We begin by envisaging physical systems, $\sigma$ say, each labelled by its parameter N, and
thus a sequence of ever larger systems $\sigma(N)$. In all that follows (including my examples)
$N \in \mathbb{N}$ := the set of natural numbers. But nothing in this Section or the sequel depends
on this: we could have $N \in \mathbb{R}$. We need to distinguish three questions, about systems,
quantities and values respectively.

(1): One can ask whether this sequence has a limit, in the sense of there being (as
a mathematical entity) a natural well-defined infinite system $\sigma(\infty)$.

(2): One can ask whether a sequence of quantities on successive systems, say
$f(N) := f(\sigma(N))$, has a limit, which we might denote by $f(\infty)$. (Of course, the physical idea of
each member of such a sequence will be in common, e.g. energy or momentum: but we
distinguish the members by their being quantities on different (sizes of) system.)

(3): Finally, one can ask whether a sequence of real number values of quantities on
successive systems, say $v(f(N)) := v(f(\sigma(N)))$, has a limit.

Of course, question (3) is the most familiar. The notion of limit is the elementary
notion from calculus, $\lim_{N \to \infty} v(f(N))$. Here a sequence of states, $s_N$ say, on the $\sigma(N)$ is
to be implicitly understood, so as to define values for the quantities $f(N)$; but to simplify
notation, I will for the most part not mention $s_N$, and indeed take states as understood.
Recall also from the calculus that if a real sequence $v_N \in \mathbb{R}$ grows without bound, i.e.
for any number $M$ the $v_N$ eventually remain greater than $M$, we write: $\lim_{N \to \infty} v_N = \infty$.
This is of course different from the idea (in Section 3.2 below) of taking $\infty$ as a possible
value of the parameter or label on $v$, i.e. the idea of a sequence element $v_\infty \in \mathbb{R}$, which
is after the denumerable sequence $v_N$.

But we can also make sense of the first two questions. As to (1): in both classical 
and quantum physics we can often define the limit of a sequence $\sigma(N)$. Some approaches
individuate a system by its state-space, and then use infinite cartesian or tensor products 
(for the classical and quantum cases respectively). Other approaches individuate a system 
by its set (in fact: algebra) of quantities, and then define limit algebras. This leads to 
how we make sense of (2). The algebra of quantities usually has a mathematical structure 
(in particular a topology) that enables one to define the limit of a sequence of quantities 
(i.e. not just, as in (3), a limit of a sequence of their values).

Note that the existence of an infinite system $\sigma(\infty)$ should not in general be identified
with the existence of a limit quantity $f(\infty)$, or even several such; nor with the sequence
of values $v(f(N))$ having a limit in the ordinary calculus sense. Indeed, my first example
(the method of arbitrary functions) will illustrate this. There will be no infinite system
$\sigma(\infty)$, but the sequences of values $v(f(N))$ will each have a limit in the ordinary sense: in
fact a finite one, viz. $\frac{1}{2}$. These limits are in no way 'singular'. Yet there will be emergence,
i.e. novel and robust behaviour.

There are also cases where there is (as a mathematical entity) an infinite system, and 
quantities defined on it whose values are the ordinary (in no way 'singular') limits of values 
on the finite systems; and where there is emergence. My third example (superselection in 
quantum theory) will illustrate this. 

And finally there are cases that suit the enthusiastic talk about singular limits! That 
is: cases where there is (as a mathematical entity) an infinite system, and quantities 
defined on it that take "new" values, i.e. values different from the limits of values on 
the finite systems. My second and fourth examples (fractals and phase transitions) will 
illustrate this, the emergence being shown by these new values. Section 3.2 gives a few
more details. 

3.2 The limit of a sequence vs. what is true at that limit

The mathematical idea of this distinction is elementary. Recall that if we adjoin the
number $\infty$ to the natural numbers $\mathbb{N}$, then we can consider sequences of real numbers
$v_n \in \mathbb{R}$, with $n \in \mathbb{N} \cup \{\infty\}$, i.e. sequences of order-type $\omega + 1$. For such sequences we can
define the ordinary notion of limit, i.e. $\lim_{n \in \mathbb{N}} v_n$; and then of course we recognize that
there are cases in which $\lim v_n := \lim_{n \in \mathbb{N}} v_n$ exists and is not equal to $v_\infty$. For $v_\infty$ means
the $(\omega + 1)$-th member of the sequence: a quite different idea from the ordinary limit!

^My examples will of course need rather more calculus: e.g. we will need to distinguish between
different kinds of convergence.

^But as argued in footnote 3, such discontinuous limits are not sufficient for emergence.
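A minimal illustration of the distinction, with values chosen here purely for definiteness (they are an assumption of this illustration, not drawn from the examples that follow): take a sequence of order-type $\omega + 1$ whose finite members tend to 0 and whose last member is simply stipulated to be 1.

```latex
v_n := \frac{1}{n} \quad (n \in \mathbb{N},\ n \geq 1), \qquad v_\infty := 1 ;
\qquad\text{so that}\qquad
\lim_{n \in \mathbb{N}} v_n = 0 \;\neq\; 1 = v_\infty .
```

The ordinary limit is fixed by the finite members alone, whereas the member labelled by $\infty$ can be assigned independently of them.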

Section 3.1's idea of an infinite system $\sigma(\infty)$ allows us to apply this mathematical
idea. We simply interpret adjoining the number $\infty$ to the set of finite values of $N$ as
considering the infinite system $\sigma(\infty)$, as well as the finite systems $\sigma(N)$. I shall spell this
out: first (a) for values of quantities, and then (b) for quantities themselves.

(a): Values of quantities: Suppose: (i) a sequence $v(f(N))$ of values of a quantity has a
limit $\lim_{N \to \infty} v(f(N))$ as $N$ tends to infinity (as mentioned in Section 3.1, a sequence of
states $s_N$ is here understood, so that one might write $v(f(N), s_N)$). And suppose also:

(ii) there is also a well-defined infinite system $\sigma(\infty)$ on which:

the common physical idea of the various $f(N)$ makes sense and gives a natural
well-defined limit quantity, which we might write as $f(\sigma(\infty))$ (on $\sigma(\infty)$); and on which
there is a natural well-defined limit state, $s$ say.

Then we need to distinguish:

(i) the given limit of the values, $\lim_{N \to \infty} v(f(N)) = \lim_{N \to \infty} v(f(N), s_N)$, from

(ii) the value $v(f(\sigma(\infty)), s)$ of the natural limit quantity $f(\sigma(\infty))$ in the natural limit
state, $s$.

(b): Quantities: For quantities themselves, rather than values, the point is in essence the
same. The statement is a close parallel of that in (a): indeed, shorter since we refer only
to quantities, not to values of quantities, albeit thereby more abstract. Thus suppose: (i)
a sequence of quantities $f(N)$ has a limit, dubbed $f(\infty)$ in Section 3.1. And suppose also:
(ii) there is also a well-defined infinite system $\sigma(\infty)$ on which the common physical idea
of the various $f(N)$ makes sense and gives a natural well-defined limit quantity, which we
might write as $f(\sigma(\infty))$ (on $\sigma(\infty)$).

Then we need to distinguish:

(i) the given limit, $f(\infty) := \lim_{N \to \infty} f(N)$, from

(ii) the natural definition of the quantity $f(\sigma(\infty))$ on $\sigma(\infty)$.

3.3 Justifying $N = \infty$

3.3.1 Distinguishing straightforward from mysterious cases 

'Justifying $N = \infty$' is of course a shorthand! For, to sum up Sections 3.1 and 3.2, we
have just learnt to distinguish two numbers: although in some models they are both
well-defined and equal, they need not be! Namely:

(i): the limit $\lim_{N \to \infty} v(f(N))$ of a sequence of values (which limit might equal $\pm\infty$);

(ii): the value $v(f(\sigma(\infty)))$ of the natural limit quantity $f(\sigma(\infty))$ on the infinite system
$\sigma(\infty)$.

So if we ask the question what justifies an '$N = \infty$' model or description of a system, for
which $N$ is actually finite, we must allow that the answers may be different for different
models. (Here and in the rest of this Subsection, I consider, for simplicity, just values of
quantities as in (a) of Section 3.2, not quantities themselves, as in (b) of Section 3.2.)
We of course expect a straightforward justification for the two cases of 'non-singular'
limits, i.e. the cases:

(a): (i) is well-defined (though perhaps $= \pm\infty$), but there is no infinite system so that
(ii) is ill-defined; (cf. my first example, the method of arbitrary functions);

(b) : there is an infinite system, and (i) and (ii) are both well-defined and are equal; 
(cf. my third example, superselection in quantum theory). 

Namely, we expect a justification in terms of convenience and correctness, along the 
lines: 

(Straightforward Justification): The use of the infinite limit (i.e. the use of
(i) for case (a), and the use of (i) = (ii) for case (b)) is justified, despite $N$
being actually finite, by its being mathematically convenient and empirically
correct (up to the required accuracy).

I shall develop and endorse this Justification in Section 3.3.3.

On the other hand, for 'singular limits', i.e. cases where (i) and (ii) are both well- 
defined but are not equal, and (ii) rather than (i) is empirically correct, matters are surely 
not straightforward. Such cases seem mysterious. Faced with such a case, should we give 
up the assumption that N is actually finite? But in some examples, e.g. where N is the 
number of molecules in a sample of gas (as in my fourth example, phase transitions), this 
apparently amounts to giving up the atomic constitution of matter.

Nevertheless, some advocates of the philosophical importance of singular limits give 
up, or at least come very close to giving up, $N$'s finiteness. I take as examples, three
quotes from Batterman (his italics): 

'a physically singular problem ... the "blow-ups" or divergences ... are the 
result of the singular nature of the physics' (2002, p. 56); 'real systems exhibit 
physical discontinuities ... genuine physical discontinuities — real singularities 
in the physical system' (2005, pp. 235-236); 'no de-idealizing story is possible 
even in principle' (2010, p. 17). 

Agreed, in other passages, he holds back (thank goodness!): 

'in (2005), I do speak rather sloppily of genuine physical singularities. It is 
best to think instead in terms of some kind of genuine qualitative change in 
the system at a given scale' (2010, p. 22); 'fluids are composed of a finite
number of molecules' (2006, p. 903); 'water in real tea kettles consists of a 
finite number of molecules' (2010, p. 7; this quotation also occurs, together 
with its surrounding passage, at 2009, p. 9). 

Note that this mysteriousness does not depend on (i) being well defined. If the $v(f(N))$
have no limit, not even $\pm\infty$, nevertheless the actual value is presumably $v(f(N_0))$ for
some actual but unknown $N_0$. So (ii) being empirically correct means that
$v(f(N_0)) \approx v(f(\sigma(\infty)))$ up to the required accuracy. But how can that be?

^Thanks to John Norton for stressing this point: as a reductio, of course.

3.3.2 Dissolving the mystery

I think the mystery can be dissolved, in two stages. (1): First, I will concede that to
deny that $N$ is finite might be a reasonable move. But in all the examples I know, in
particular in all of my examples, this move is wrong. (Here, the important point is that
in my second and fourth examples, fractals and phase transitions, this move is wrong: for
as noted in Section 3.3.1, my first and third examples of emergence have no mysterious
'singular limit'.) So the more important stage will be the second one, (2) below: viz.,
that we need to consider quantities other than $f$. I turn to details.

(1): Denying that $N$ is finite; other degrees of freedom: I admit it can be reasonable
to deny that $N$ is finite. But this means something less radical than denying atomism!
Rather we conclude that the finite-$N$ model has not picked the right, or not all the right,
degrees of freedom for understanding the system; and that the (model of the) infinite
system has somehow 'clued in to' the missing relevant degrees of freedom, as shown by
its empirical correctness.

My fourth example, phase transitions in statistical mechanics, provides a putative 
example. Assuming that the correct description of a boiling kettle requires infinitely many 
degrees of freedom, it is reasonable to say that, since the kettle contains finitely many 
atoms, and so finitely many mechanical degrees of freedom, other degrees of freedom — 
e.g. of the electromagnetic field — must somehow be involved. Reasonable: but very 
programmatic! In fact, there is good evidence that the electromagnetic field is not involved 
in phase transitions — suggesting that the answer to the mystery lies elsewhere ... 

(2): Other quantities: The mystery is an artefact of focussing on just one quantity
($f$ in my notation). Once we consider appropriate other quantities (and maybe related
mathematical notions), the mystery dissolves. Thus in my second and fourth examples
(fractals and phase transitions), there are other quantities, for which (despite $f$'s singular
limit) the finite-$N$ model, for large $N$, is close to the values given by the infinite model:
and is thereby also empirically correct. In fact, these other quantities are 'cousins' of the
quantity $f$ which we first considered. Thus the mystery will be dissolved by my second
claim, (2:Before): namely, we see a weak yet vivid version of the emergent behaviour before
we get to the limit. Besides, I would claim (though I cannot defend it in this paper)
that this is so in all of physics' similar cases (in particular, in Batterman's examples from
optics, semiclassical mechanics and hydrodynamics).

Agreed, for me to say 'there are other quantities or notions for which the finite-$N$ model
is close to the infinite model' or 'we see a weak version of emergence before the limit', is
unsatisfyingly abstract. Indeed, it is dismayingly close to the mysterious explanandum,
viz. that the infinite model is empirically correct! But I submit that at this very general
level, these formulations are the best one can do. To see vividly how the mystery dissolves, 
one has to look at examples; cf. my second and fourth examples. But here is a simple 
mathematical example illustrating the issues — and that there really is no mystery! As we 
shall see, it is not just a mathematical toy: it models physical situations, especially phase 
transitions.

Consider the sequence of real functions $g_N : \mathbb{R} \to \mathbb{R}$, $N \in \mathbb{N}$, defined by

$g_N(x) := -1$ iff $x \le -\frac{1}{N}$ ; (3.1)
$g_N(x) := Nx$ iff $-\frac{1}{N} \le x \le \frac{1}{N}$ ; (3.2)
$g_N(x) := +1$ iff $\frac{1}{N} \le x$. (3.3)

Thus $g_N$ is constant and equal to $-1$ (respectively $+1$) for $x$ less than $-\frac{1}{N}$ (respectively:
greater than $+\frac{1}{N}$); and it increases linearly, with gradient $N$, over the interval $[-\frac{1}{N}, +\frac{1}{N}]$, so
that for all $N$, $g_N(0) = 0$. Each $g_N$ is continuous; but the sequence has as its limit the
function $g_\infty$ given by

$g_\infty(x) = -1$ iff $x < 0$ ; $g_\infty(0) = 0$ ; $g_\infty(x) = 1$ iff $0 < x$ ; (3.4)

which is discontinuous at 0. So this limit is 'singular' in the sense that continuity is lost.

We can make this more formal by introducing a two-valued quantity $f_N$, $N \in \mathbb{N} \cup \{\infty\}$,
that encodes whether or not $g_N$ is continuous: $f_N := 0$ if $g_N$ is continuous and $f_N := 1$ if
$g_N$ is discontinuous. Then we have: $f_N = 0$ for all finite $N \in \mathbb{N}$, but $f_\infty = 1$. So in our
(i)/(ii) notation (from Sections 3.2 and 3.3.1), we have a case where (i) and (ii) are both
well-defined but are unequal.
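The definitions (3.1) to (3.4) can be checked numerically. The following Python sketch is only an illustration: the function names `g`, `g_inf` and `gap_near_zero`, and the choice of probe points, are assumptions of the sketch rather than anything in the text. It suggests that at any fixed $x$ the values $g_N(x)$ settle on $g_\infty(x)$, while just to the right of 0 the gap between $g_N$ and $g_\infty$ stays close to 1 however large $N$ is; that is, the convergence is pointwise but not uniform.

```python
def g(n: int, x: float) -> float:
    """g_N of (3.1)-(3.3): -1 left of -1/N, slope N on [-1/N, 1/N], +1 right of 1/N."""
    if x <= -1.0 / n:
        return -1.0
    if x >= 1.0 / n:
        return 1.0
    return n * x


def g_inf(x: float) -> float:
    """The pointwise limit of (3.4): the sign function, with value 0 at x = 0."""
    if x < 0:
        return -1.0
    if x > 0:
        return 1.0
    return 0.0


def gap_near_zero(n: int, k: int = 1000) -> float:
    """|g_N - g_inf| at the probe point x = 1/(N*k), which lies inside (0, 1/N)."""
    x = 1.0 / (n * k)
    return abs(g(n, x) - g_inf(x))


if __name__ == "__main__":
    x = 0.01  # an arbitrary fixed argument, chosen for this sketch
    for n in (10, 100, 1000, 10000):
        # columns: N, g_N(x), g_inf(x), gap just to the right of 0
        print(n, g(n, x), g_inf(x), gap_near_zero(n))
```

The probe point moves towards 0 as $N$ grows, which is exactly what uniform convergence would have to control and pointwise convergence does not.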

But there is no mystery here! There only seems to be a mystery if we look solely at
the two-valued quantity $f_N$, whose values report that the limit is 'singular', but which
say nothing about how "close", for large $N$, the $g_N$ are to $g_\infty$.

Besides, there remains no mystery if we add some physical interpretation to the discussion.
Thus imagine that the values of $g_N$ in a neighbourhood of 0, or the slope of $g_N$
thereabouts, are part of a model of a system with $N$ degrees of freedom. $N$ varies, and is
in general large; so that one considers the sequence of functions $g_N$. Now imagine that for
large $N$, it is hard to know the actual value $N_0$ of $N$ and-or hard to calculate the value
of $g_N$, even if you know $x$. (Agreed: my example is so simple that only a dimwit could
find the calculation hard! Such is the price of a simple example ...) In this situation, it
obviously could be both (a) mathematically convenient and (b) empirically accurate (i.e.
close enough to the predictions made by $g_{N_0}(x)$ for the actual $x$) to work with $g_\infty$.

For as to (a): $g_\infty$'s being discontinuous need not make it inconvenient. Better the
discontinuous $g_\infty$ that you can get a grip on, than the hard-to-know and-or
hard-to-calculate $g_{N_0}$. And as to (b): as $N$ grows, the range of $x$ for which $g_N(x) \neq g_\infty(x)$
becomes arbitrarily small. Besides, for $x = 0$ (which might be a physically significant
argument) $g_\infty$ is completely accurate: i.e. for all $N$, $g_N(0) = g_\infty(0)$. In short: again, no
mystery. There only seems to be a mystery if we look solely at $f_N$, and ignore the details
about $g_N$ and $g_\infty$.
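Point (b) can be made quantitative. In the following Python sketch (illustrative only: the interval $[-1, 1]$, the uniform weighting of $x$, and the names `g`, `g_inf` and `mean_abs_error` are assumptions introduced here), the average discrepancy between $g_N$ and $g_\infty$ over the interval is estimated by a midpoint rule. A short calculation gives the exact value: the integral of $|g_N - g_\infty|$ over $[-1, 1]$ is $2 \int_0^{1/N} (1 - Nx)\,dx = 1/N$, so the mean is $1/(2N)$.

```python
def g(n, x):
    """g_N written as a clamp: slope N through the origin, cut off at -1 and +1."""
    return max(-1.0, min(1.0, n * x))


def g_inf(x):
    """Pointwise limit: sign of x, with value 0 at x = 0."""
    return float((x > 0) - (x < 0))


def mean_abs_error(n, cells=20_000):
    """Midpoint-rule estimate of the mean of |g_N - g_inf| over [-1, 1]."""
    width = 2.0 / cells
    total = 0.0
    for i in range(cells):
        x = -1.0 + (i + 0.5) * width  # midpoint of the i-th cell
        total += abs(g(n, x) - g_inf(x))
    return total * width / 2.0  # integral divided by the interval length 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # columns: N, numerical estimate, exact value 1/(2N)
        print(n, mean_abs_error(n), 1.0 / (2 * n))
```

On this measure the error of working with $g_\infty$ instead of $g_N$ shrinks in proportion to $1/N$, even though the two-valued continuity flag jumps at the limit.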

Finally, I stress that this mathematical example has two other features that make it a
good prototype for my "singular limit" examples, i.e. my second and fourth, fractals and
phase transitions; (hence my choice of it!). First: each example will have a two-valued
quantity $f_N$, $N \in \mathbb{N} \cup \{\infty\}$, with $f_N = 0$ for all $N \in \mathbb{N}$ and $f_\infty = 1$. In fact, this quantity
simply records the presence or absence of the emergent novel property; with presence
encoded by $f = 1$. So the jump in the value of $f$ corresponds to my claim (1:Deduce).

^The convergence is pointwise not uniform: uniform limits of continuous functions are continuous.

Second: in phase transitions (my fourth example, Section 7) there are physical quantities
for finite models whose gradients grow without bound as $N \to \infty$, just like this
example's gradients of $g_N$ in a neighbourhood of 0. So the remarks here, about the
unmysterious mathematical convenience and empirical accuracy of $g_\infty$, will apply, word for
word!

Let me look ahead a little to Section 7, especially Section 7.1.2.B (if only to placate
aficionados!). Consider the phase transition of a ferromagnet at sub-critical temperatures,
as described by the Ising model with $N$ sites (in two or more spatial dimensions).
The magnetization behaves, as a function of the applied magnetic field, very like this
example's $g_N$. Thus suppose our variable $x$ represents the value of the applied field (in a
given spatial direction). Then to a good approximation, $g_N$ represents the average
magnetization (in appropriate units). So as the applied field passes from negative to positive
values, the ferromagnet's magnetization flips from $-1$ (i.e. alignment with the field in the
negative direction) to $+1$ (alignment in the positive direction). But for larger $N$, the
ferromagnet "lingers longer": the larger number of sites gives it more "inertia" before
the rising value of $x$ succeeds in flipping the magnetization from $-1$ to $+1$. (Here, my
qualifying phrase 'to a good approximation' refers to the Ising model's magnetization
being a smooth function of the applied field (in fact given, in mean field theory, by the
hyperbolic tangent function tanh), and so without sharp corners at $\pm 1/N$ like my $g_N$.)
Thus the magnetic susceptibility, defined as the derivative of magnetization with respect
to magnetic field, is, in the neighbourhood of 0, larger for larger $N$, and tends to infinity
as $N \to \infty$: compare the gradients of $g_N$ in my example. Very similar remarks apply
to liquid-gas phase transition, i.e. boiling. Here the quantity which becomes infinite in
the $N \to \infty$ limit, i.e. the analogue of the magnetic susceptibility, is the compressibility,
defined as the derivative of the density with respect to the pressure.
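As a rough numerical companion to this comparison, the following Python sketch replaces $g_N$ by the smooth curve $m_N(x) = \tanh(Nx)$. That specific form, and in particular the scaling of the argument by $N$, is an assumption made here for illustration (prompted by the remark that mean field theory gives a hyperbolic tangent); it is not the Ising model's actual magnetization. The sketch estimates the slope at $x = 0$, the analogue of the susceptibility, by a central difference, and the slope comes out growing in proportion to $N$.

```python
import math


def magnetization(n, x):
    """Toy smooth analogue of g_N: tanh(N * x). Hypothetical form, for illustration only."""
    return math.tanh(n * x)


def susceptibility_at_zero(n, h=None):
    """Central-difference estimate of d(magnetization)/dx at x = 0."""
    if h is None:
        h = 1e-3 / n  # step well inside the steep region, whose width scales like 1/N
    return (magnetization(n, h) - magnetization(n, -h)) / (2.0 * h)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        # columns: N, estimated slope at 0, toy magnetization at x = 0.01
        print(n, susceptibility_at_zero(n), magnetization(n, 0.01))
```

For this toy curve the exact slope at 0 is $N$, so the printed estimates should track $N$ closely, while away from 0 the values saturate near $\pm 1$ as with $g_N$.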

To sum up: I have dissolved the mystery about cases in which (i), i.e. the limit of the 
finite model, is not equal to (ii), the infinite model, and in which (ii) is empirically correct, 
by arguing that there are other quantities ($g$ rather than $f$, in my notation) for which (i)
is close to (ii) (and so, also, empirically correct). I can therefore turn to elaborating and
endorsing the Straightforward Justification which I announced in Section 3.3.1: in short,
mathematical convenience and empirical correctness. For I now maintain that it applies 
to all my four examples. 

^Cf. also Kadanoff (2010, p. 20, Figure 5); Menon and Callender (2011) is a discussion of phase
transitions concordant with mine, here and in Section 7. You may well ask: Is my mathematical example also
a good prototype for dissolving the corresponding alleged mystery in physics' other 'singular' limits, e.g.
from optics, semiclassical mechanics and hydrodynamics? My view is: Yes. For a masterly philosopher's
survey of the first two cases, cf. Belot (2005, Sections 3, 4 and Appendix).

3.3.3 Developing the Straightforward Justification

This Justification consists of two obvious, very general, broadly instrumentalist, reasons 
for using a model that adopts the limit $N = \infty$: mathematical convenience, and empirical
adequacy (up to a required accuracy). So it also applies to many other models that are
almost never cited in philosophical discussions of emergence and reduction. In particular, 
it applies to the many classical continuum models of fluids and solids, that are obtained 
by taking a limit of a classical atomistic model as the number of atoms tends to infinity 
(in an appropriate way, e.g. keeping mass density constant). 

'Mathematical convenience and empirical correctness': merits that are so easy to state! 
But as all physicists know, and as echoed in the companion paper's discussion of good 
variables and approximation schemes: both can be very hard to attain — indeed, most of 
a physicist's work with a model is devoted to attaining them! But if they are attained
by adopting the limit $N = \infty$, they surely justify using the limit. (At least, they do so,
once we have disposed of any suspicious threat of mystery, such as refuting the atomic 
constitution of matter!) 

Though the details vary widely among the countless models adopting some $N = \infty$
limit, this justification involves two themes that are common to so many such models
that I should articulate them. The first theme is abstraction from finitary effects. That
is: the mathematical convenience and empirical adequacy of many such models arise, at
least in part, by abstracting from such effects. Consider (a) how transient effects die out
as time tends to infinity; and (b) how edge/boundary effects are absent in an infinitely
large system.

The second theme is that the mathematics of infinity is often much more convenient 
than the mathematics of the large finite. The paradigm example is of course the conve- 
nience of the calculus: it is usually much easier to manipulate a differentiable real function 
than some function on a large discrete subset of $\mathbb{R}$ that approximates it. I shall just
spell out two advantages which are endemic. We can begin with the simple case where we
consider just the limit of the values, i.e. (i) of Section 3.2: so we set aside for the moment
the infinite model, (ii) of Section 3.2.

Thus consider a model in which the actual value of the relevant quantity for realistic,
i.e. large but finite, $N$, say $N = 10^{23}$ (the value $v(f(10^{23}))$ in Section 3.2's notation,
taking the state as understood) is negligibly close to the limit $\lim_{N \to \infty} v(f(N))$. And let
us assume that the value will remain close as $N$ grows: so the values obey
$v(f(10^{23})) \approx v(f(10^{24})) \approx v(f(10^{25}))$ etc. Working with the limit rather than the actual value promises
two advantages. (Here of course we set aside Section 2's warning (4:Unreal), that for many
models, the values for vastly larger $N$ will eventually be unrealistic.)

^As to (a), it is worth recalling the witty definition, attributed to Feynman, of that (invaluable but
much-contested!) concept, 'equilibrium': 'the state the system gets into after the fast stuff [e.g. relaxation,
transients] is finished and the slow stuff [e.g. Poincaré recurrence] has not yet started'. For apart from
being witty, the mention of 'the slow stuff' echoes Section 2's warning (4:Unreal). That is: we should
beware that for very large times (not just for very large $N$) physical theories and models often become
unrealistic. And as to both (a) and (b), recall also the idea of intermediate asymptotics from an earlier
footnote. Thus Feynman's witty definition should be revised along the lines 'the state the system gets into after both
the really fast stuff, and the intermediate stuff, is finished and ...'.

^But smoothness is not everything! In some cases, as we saw with Section 3.3.2's $g_\infty$, a discontinuous
function is more convenient than a continuous one.

The first is that it may be much easier to know, or at least estimate, the limit's value
than the actual value, not least because of the first theme, the abstraction from finitary
effects. And ex hypothesi, working with it involves a negligible inaccuracy about the actual
value.

The second advantage is more theoretical, and will lead back to Section 3.2's (ii), i.e.
the value of a limit quantity on an infinite system. The idea here is that for most models
and quantities $f$, there is, for a fixed $N$, not a single value $v(f(N))$, but a range of values,
to be considered. That is: $v(f(N))$ is a function of some other variable which has so
far been suppressed in my notation. And to make this function easily manipulated, e.g. 
continuous or differentiable so that it can be treated with the calculus, we often need 
to have each value of the function be defined as a limit (namely, of values of another 
function) . 

Continuum models of solids and fluids provide paradigm examples of this. For example,
consider the mass density varying along a rod, or within a fluid. For an atomistic
model of the rod or fluid, that postulates $N$ atoms per unit volume, the average mass-density
might be written as a function of both position $x$ within the rod or fluid, and the
side-length $L$ of the volume centred on $x$, over which the mass-density is computed:
$f(N, x, L)$. Now the point is that for fixed $N$, this function is liable to be intractably
sensitive to $x$ and $L$. In particular, if atoms are or contain point-particles the function
will jump when $L$ is varied so as to include or exclude one such particle. That is: it will
not be continuous in $x$ and $L$. But by taking a continuum limit $N \to \infty$, with $L \to 0$ (and
atomic masses going to zero appropriately, so that quantities like density do not "blow
up"), we can define a continuous, maybe even differentiable, mass-density function $\rho(x)$
as a function of position, and then enjoy all the convenience of the calculus.
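The contrast between the atomistic and the continuum density can be illustrated in one dimension. The Python sketch below is an illustration under stated assumptions, not the text's model: atoms are point particles equally spaced on the unit interval, each of mass $1/N$ (so that the total mass stays 1 as $N$ grows), and `window_density` is a hypothetical helper returning the average mass density in a window of length $L$ centred on $x$. For fixed small $N$ the result jumps as $L$ crosses the spacing of the atoms; when $N$ grows while $L$ shrinks more slowly (here $L = N^{-1/2}$, so that the window still holds many atoms), the values approach the constant continuum density 1.

```python
def window_density(n, x, length):
    """Average mass density in [x - length/2, x + length/2] for n point atoms.

    Assumption of this sketch: atoms sit at (i + 0.5)/n for i = 0..n-1,
    and each carries mass 1/n.
    """
    lo, hi = x - length / 2.0, x + length / 2.0
    count = sum(1 for i in range(n) if lo <= (i + 0.5) / n <= hi)
    return (count / n) / length


if __name__ == "__main__":
    # Fixed N = 10: the density jumps when the window grows past the nearest atoms.
    for length in (0.09, 0.11, 0.29, 0.31):
        print("N=10", length, window_density(10, 0.5, length))
    # Growing N with L = N**-0.5: the values settle towards 1.
    for n in (10**2, 10**3, 10**4, 10**5):
        print(n, window_density(n, 0.5, n ** -0.5))
```

The choice $L = N^{-1/2}$ is only one way of letting $L \to 0$ while keeping the number of atoms in the window large; other scalings with the same property would serve equally well.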

So much by way of showing in general terms how the use of an infinite limit $N = \infty$
can be justified, but not mysterious! At this point, the general philosophical argument
of this paper is complete! The subsequent Sections present my examples. It will be clear 
that each example represents a large field of study. So to save space, I will have to be 
brutally brief, both about the examples' details and about references. 

4 The method of arbitrary functions 

My first example is the method of arbitrary functions in probability theory. It is a venerable
tradition, initiated by Poincaré in his Calcul des Probabilités (1896), and developed
by many authors including Borel, Fréchet and Hopf. Recent presentations include Engel
(1992) and Kritzer (2003); and von Plato (1983, 1994, pp. 168-178) summarizes the
history. But until recently it seems to have been largely neglected in the philosophy of
probability, despite its offering an attractive way to reconcile non-trivial probabilities (i.e.
probabilities that are neither 0 nor 1) with determinism at an 'underlying' level, and
despite being the topic of Reichenbach's dissertation.

The main idea of the method is best introduced by an example, and I will follow
Poincaré (and most discussions) in choosing a roulette wheel, with alternating arcs of red
and black (Section 4.1). Thus we will be concerned with the probability that the wheel
stops with a red (respectively, black) arc opposite a pointer. For this example, the main
idea will be that under certain assumptions, this probability tends to 0.5, as the number
of arcs goes to infinity, whatever the details of the spinning and slowing of the wheel.
Section 4.1 will also discuss how this result can be generalized. Then in Section 4.2 I
describe how this equiprobability in the limit $N \to \infty$ counts as emergent behaviour in
my sense, and how it illustrates my claims, (1:Deduce) etc.

4.1 Poincaré's legacy

4.1.1 Poincaré's roulette wheel

Suppose that a roulette wheel with arcs of red and black is spun many times, eventually 
coming to a stop with a red or a black arc opposite a pointer. We suppose that it is spun 
using various unknown initial conditions, i.e. initial positions relative to the pointer and 
initial angular velocities; and that it is slowed and eventually stopped by some unknown 
regime of friction. If this is all we know, we can conclude essentially nothing about the 
long-run frequency (or probability, in any sense) of it stopping at Red (i.e. with a red 
arc opposite the pointer). For the variety of initial conditions and the regime of friction, 
taken together, amount to an unknown profile of biassing. This profile might be expressed 
as a function giving, for each arc, the probability of the wheel stopping there. And for all 
we have so far assumed, this function might make Red very probable (frequent) — or very 
improbable (infrequent). 

But suppose we also assume that: 

(i) : there are very many alternating arcs of red and black; 

(ii) : whatever the unknown profile of biassing might be, it favours and disfavours 
large segments, i.e. segments each of which contains many red and many black arcs; 

(iii) : within one of these large segments, the bias is not too "wiggly" in the sense 
that two adjacent arcs get nearly equal biasses. 

Then we can be confident that the long-run frequency of Red (and of Black) is about 
50%. For assumptions (i) to (iii) mean that if the profile is expressed as a probability 
function, each of its peaks (corresponding to a favoured segment) contains many red and 
many black arcs — and so do each of its troughs (corresponding to a disfavoured segment). 
Thus the contribution of any peak to the overall probability (or frequency) of stopping at 
Red will be about equal to the peak's contribution to the probability of stopping at Black; 
and similarly for any trough. So summing over all the peaks and troughs, the honours
will be about even between Red and Black: there will be approximate equiprobability. To
sum up: (i) to (iii) imply that the idiosyncrasies of the biassing profile get washed out.

^I say 'until recently' for two reasons. First: Strevens (2003) has revived the main idea; though he
is wary of the philosophical value of theorems about limiting behaviour, which figure prominently in the
tradition and which I will emphasize. For assessments of Strevens, cf. Colyvan (2005) and Werndl (2010).
Second: some recent papers revive the main idea: Sober (2010), Frigg and Hoefer (2010), Myrvold (2011).
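The washing-out can be seen in a small simulation. The Python sketch below is a toy model resting on assumptions that are not in the text: the wheel starts at angle 0, its initial angular velocity is drawn from a Gaussian with mean 3 and standard deviation 0.3 (arbitrary units), friction is a constant deceleration (so the angle travelled is $\omega^2 / 2a$), and arcs are numbered from the pointer with even-numbered arcs red. With only two arcs this biassing profile makes Red rare; with many arcs the frequency of Red comes out close to one half.

```python
import math
import random


def stops_on_red(omega, n_arcs, decel=1.0):
    """True if a spin with initial angular velocity omega ends on a red arc.

    Toy dynamics (an assumption of this sketch): constant deceleration `decel`,
    so the total angle travelled is omega**2 / (2 * decel).
    """
    angle = (omega * omega / (2.0 * decel)) % (2.0 * math.pi)
    arc = int(angle / (2.0 * math.pi) * n_arcs) % n_arcs
    return arc % 2 == 0  # convention of this sketch: even-numbered arcs are red


def red_frequency(n_arcs, spins=50_000, seed=0):
    """Relative frequency of Red over many spins with randomly drawn omega."""
    rng = random.Random(seed)
    reds = sum(stops_on_red(rng.gauss(3.0, 0.3), n_arcs) for _ in range(spins))
    return reds / spins


if __name__ == "__main__":
    for n_arcs in (2, 4, 20, 200):
        print(n_arcs, red_frequency(n_arcs))
```

Changing the mean or spread of the velocity distribution changes the frequencies for few arcs considerably, but should matter less and less as the number of arcs grows, which is the point of assumptions (i) to (iii).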

This is a beautiful and compelling idea; (originally due, apparently, to an 1886 book
by von Kries; cf. von Plato 1983, p. 38; 1994, p. 169). Expressing it in general and
probabilistic terms, we expect the following. Let a sample space $(X, \mu)$ be partitioned
into two subsets, say $R$ and $B$, in a very "intricate" or "filamentous" way. Then for any
probability density function $f$ that is not too "wiggly" (say: whose derivative is bounded:
$|f'| < M$) the probabilities of $R$ and $B$ are about equal:

$\int_R f \, d\mu \approx \int_B f \, d\mu \approx \frac{1}{2}$. (4.1)

And we expect that, for any bound $M$ on the derivative of the density $f$, as the partition
becomes more intricate or filamentous, the difference from exact equiprobability (and so
from both probabilities equalling $\frac{1}{2}$) will tend to 0.

Indeed, Poincaré (1912, p. 148ff.) turned this idea into a theorem, for a simple model
of the roulette wheel. So we take $X$ to be the circle $[0, 2\pi]$, and the intricate partitioning
of $X$ to be the division into $N$ equal intervals, labelled alternatingly 'red' and 'black'.
We assume the distribution of the point $x \in X$ at which the wheel stops (i.e. which is
eventually opposite the pointer) is given by a probability density function $f : [0, 2\pi] \to \mathbb{R}$.
We assume that $f$ is differentiable, and its derivative is bounded by $M$, i.e. $|f'| < M \in \mathbb{R}$.
This of course makes precise assumptions (ii) and (iii) above. Then Poincaré showed:

For any $M \in \mathbb{R}$, for all density functions $f$ with derivative bounded by $M$,

$|f'| < M$: as $N$ = the number of arcs goes to infinity:

$\int_{\mathrm{Red}} f \, d\mu = \mathrm{prob(Red)} \to \frac{1}{2}$ ; and $\int_{\mathrm{Black}} f \, d\mu = \mathrm{prob(Black)} \to \frac{1}{2}$.

To sum up: any biassing profile, no matter how wiggly, i.e. sensitive to the wheel's angular 
position (no matter how large M), can be washed out, so as to give equiprobability up to 
an arbitrary accuracy, by a sufficiently intricate partition, i.e. by a sufficiently large N. 
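
The washing-out can be checked numerically. The following Python sketch is only an illustration under stated assumptions, not a rendering of Poincaré's proof: the von Mises-shaped density and its parameters `kappa` and `x0` are arbitrary choices made for this example, and the even-numbered arcs are taken to be Red. For even $N$, pairing each Red arc with its Black neighbour and applying the mean value theorem gives the elementary bound $|\mathrm{prob(Red)} - \tfrac{1}{2}| \le \pi^2 M / N$, which the sketch prints alongside the computed probability.

```python
import math

TWO_PI = 2.0 * math.pi

def make_density(kappa=2.0, x0=0.7, grid=20000):
    """Return (f, m_bound): a smooth periodic density on [0, 2*pi] and a bound on |f'|.

    f(x) is proportional to exp(kappa * cos(x - x0)). This shape is an
    illustrative assumption, not something taken from the text.
    """
    step = TWO_PI / grid
    # Normalising constant, by the midpoint rule over one full period.
    z = sum(math.exp(kappa * math.cos((i + 0.5) * step - x0)) for i in range(grid)) * step
    f = lambda x: math.exp(kappa * math.cos(x - x0)) / z
    # f'(x) = -kappa * sin(x - x0) * f(x), hence |f'| <= kappa * max f.
    m_bound = kappa * math.exp(kappa) / z
    return f, m_bound

def prob_red(f, n_arcs, pts_per_arc=400):
    """prob(Red) when the even-numbered of n_arcs equal arcs are Red."""
    arc = TWO_PI / n_arcs
    dx = arc / pts_per_arc
    total = 0.0
    for j in range(0, n_arcs, 2):        # Red arcs: j = 0, 2, 4, ...
        for i in range(pts_per_arc):     # midpoint rule inside the arc
            total += f(j * arc + (i + 0.5) * dx) * dx
    return total

if __name__ == "__main__":
    f, m_bound = make_density()
    for n in (2, 4, 8, 16, 32):
        bound = math.pi ** 2 * m_bound / n
        print(n, "prob(Red) =", round(prob_red(f, n), 6), " bound on deviation:", round(bound, 4))
```

The bound is far from tight for so smooth a density; the point is only that the deviation from $\tfrac{1}{2}$ is controlled by $M/N$, whatever the details of $f$.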

4.1.2 Generalizations: statistical stability 

Subsequently, Poincare's theorem was generalized in two main ways. The first way was 
historically earlier and is less connected to later developments, especially of probabilistic 
methods in the study of dynamical systems. But it is easier to report since its conception 
of the parameter N is very close to Poincare's original: it measures the fineness of the 
partition of the sample space. In the second way, on the other hand, one takes a different 
limit, usually depending on the details of the dynamical system concerned. 

I will now sketch both ways. But as regards illustrating my claims about emergence, 
I should stress the following points. 

^^We might also assume that the support of $f$ intersects all $N$ cells of the partition. This is one
way (among several) to represent the natural requirement that the wheel is spun fast enough, at least
sometimes, to prevent it stopping after just a few arcs have passed the pointer.

(a) The illustrations do not need any of these generalizations; so the reader uninterested
in probability theory can now skip to Section 4.2.

(b) The first way leads to illustrations of my claims that are exactly parallel to the 
original illustration given by Poincare's theorem: a happy circumstance, since it supports 
my view that my claims have a wide validity. 

(c) The second way also illustrates my claims. But because a different, and even 
system-dependent, limit is taken, these illustrations are rather different from the Poincare 
original. So to save space, I will not pursue the details. 

(d) Poincare's theorem and its generalizations (in both ways) are very suggestive for 
the philosophy of probability. As we will see, they hint that even with an underlying 
determinism, taking an appropriate limit can define non-trivial probabilities that are "ob- 
jectively correct". But again, to save space, I must make a self-denying ordinance about 
this. 

The first way generalized the assumptions of the model of the wheel, and adapted
them to other chance set-ups. At first the conditions on the initial density function $f$
were weakened, by authors such as Borel and Fréchet. In short, Borel assumed merely
that $f$ was continuous; and Fréchet merely that it was Riemann-integrable.

As to other chance set-ups, one paradigm example, which had the merit of extending
the method of arbitrary functions to densities of more than one variable, was Buffon's
needle. In this problem a person throws a needle of length $l$ on to a table on which a
pattern of parallel lines at a distance $d$ ($d > l$) has been ruled. One asks: what is the
probability that the needle lands so as to intersect one of the lines? The elementary
treatment assumes that the point where the centre of the needle lands has a uniform
probability density (in the interval $[0, d]$ for simplicity); and similarly that the angle
between the needle and the lines is uniformly distributed. It then follows by an elementary
argument that the probability of intersection is $2l/(\pi d)$.
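
For reference, the elementary argument can be spelled out as follows. This is the standard textbook derivation, sketched here with two symbols introduced only for this illustration: $y$ for the position of the needle's centre between two adjacent lines, and $\theta$ for the angle between the needle and the lines.

```latex
% Assumptions: y ~ Uniform[0, d], \theta ~ Uniform[0, \pi), independent; needle length l < d.
% The needle crosses a line iff the distance from its centre to the nearest line
% is at most (l/2)\sin\theta, i.e. iff \min(y, d - y) \le (l/2)\sin\theta.
% For fixed \theta that set of y has total length l\sin\theta (two end-pieces of
% length (l/2)\sin\theta each, which do not overlap because l < d).
\begin{align*}
\mathrm{prob}(\text{intersection})
  &= \frac{1}{\pi} \int_0^{\pi} \frac{1}{d}\,
     \Bigl|\bigl\{\, y \in [0,d] : \min(y,\, d-y) \le \tfrac{l}{2}\sin\theta \,\bigr\}\Bigr| \, d\theta \\
  &= \frac{1}{\pi} \int_0^{\pi} \frac{l \sin\theta}{d} \, d\theta
   \;=\; \frac{2l}{\pi d}.
\end{align*}
```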

But it is more realistic to assume that there is some unknown ("arbitrary") density
function, perhaps peaked near the centre of the table, for the point where the centre of
the needle lands. Can we again apply von Kries' and Poincaré's idea that a more and
more intricate partition of the sample space (here, the table) will wash out the influence
of the peaks (and troughs) of the unknown density function? Yes! Borel indicated, and
Hostinský showed in detail, that one can recover the familiar answer, $2l/(\pi d)$, by taking the
limit as the number of lines on the table goes to infinity. For this theorem, Hostinský
assumed that the partial derivatives of the density function exist, are continuous and are
bounded. And he took the limit, $N \to \infty$, while (i) the table size is constant, so that the
lines' separation $d$ goes to zero, and (ii) the ratio $l/d$ is constant.

At this point, we must concede that the theorems reported so far have an obvious
limitation: the limit, $N \to \infty$, is unrealistic. The number of arcs on a roulette wheel,
and the number of parallel lines on any table, is in fact fixed. (So this sense of being
unrealistic is more straightforward, and in practice arises for much smaller $N$, than the
idea of running up against the atomic constitution of matter, involved in my (4:Unreal) of


^^Similarly, one might say, for the angle at which the needle lands. But I will not pursue how to relax 
this assumption. 
Section 2.) Can we respect this fact, and yet still apply our initial idea that an intricate
partition of the sample space washes out the influence of the peaks and troughs of an 
unknown density function? 

As I see matters, there are two broad strategies one can adopt. Both are important; 
and fortunately, they are compatible. The first strategy is piecemeal, and takes no limits. 
One models each chance set-up as realistically as one wishes or is able to; and then 
calculates, perhaps numerically, how wiggly (in some sense) the density function could 
be, while yielding approximately the probabilities we observe and/or desire: e.g. for the
roulette wheel, equiprobability of Red and Black. This strategy is obviously sensible; and
in Section 4.2.2 we will see how it illustrates my claim (2:Before). But for now, I turn to
the other strategy. 

This is what I called the 'second way' of generalizing Poincaré's theorem. In short:
to derive the observed or desired probabilities, a different limit is taken. This strategy
can also be piecemeal: the details of the chance set-up suggest what limit to take. I
shall briefly report two impressively neat examples of this: Hopf's analysis of the roulette
wheel, and Keller's analysis of coin-tossing. Then I shall report how this second way leads
to the important idea of statistical stability.

Hopf's idea is that for a roulette wheel with a fixed number of arcs, the equiprobability
of Red and Black will follow from allowing higher and higher initial angular velocities.
Thus the basic insight is that even with $N$ fixed, a higher initial angular velocity implies
that the width of an interval of velocities that lead to a specific arc stopping opposite
the pointer is smaller. Or to make the same point at the opposite extreme: with just
a few arcs (say, two!), and initial angular velocities so small that at most one rotation
occurs, even a ham-fisted croupier can fix the wheel, i.e. guarantee stopping at Red, or
at Black. In more detail: Hopf considers the total angle $\theta \in [0, \infty]$ through which some
fiducial point on the wheel's circumference turns before the wheel stops. Higher initial
angular velocities will make $\theta$ larger; and Red or Black is determined by $\theta \bmod 2\pi$. The
regime of spinning and friction is summarized in an unknown density function $f$ on the
initial angular velocity $\omega$, with bounded support. But higher velocities are considered
by translating $f$ by a constant $C$, i.e. by defining $f^*(\omega) := f(\omega - C)$; and by letting
$C \to \infty$. Hopf also allows the frictional force (the braking) to depend, not only on the
present angular velocity, but also on the angle so far turned through; that is, he allows
for an unbalanced wheel. Hopf then proves that as $C \to \infty$, the distribution of $\theta \bmod 2\pi$
tends to being uniform on $[0, 2\pi]$.
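
A toy version of this translation argument can be run numerically. The sketch below is not Hopf's general setting: it assumes a constant angular deceleration `decel` (so that the wheel turns through $\theta = \omega^2 / (2 \cdot \mathrm{decel})$ before stopping), a hypothetical bump-shaped density supported on $[0, 1]$, four arcs with the even-numbered ones Red, and the translation constant $C$ passed in as `shift`. All of these names and values are illustrative choices.

```python
import math

def bump(w):
    """Hypothetical density of the initial angular velocity, supported on [0, 1]."""
    return 6.0 * w * (1.0 - w) if 0.0 <= w <= 1.0 else 0.0

def prob_red_hopf(shift, n_arcs=4, decel=1.0, n_pts=20000):
    """prob(Red) for the translated density f*(w) = bump(w - shift).

    Toy dynamics (an assumption): constant deceleration, so the total angle
    turned is theta = w**2 / (2 * decel). Even-numbered arcs are Red.
    """
    arc = 2.0 * math.pi / n_arcs
    dw = 1.0 / n_pts
    total = 0.0
    for i in range(n_pts):                      # midpoint rule on [shift, shift + 1]
        w = shift + (i + 0.5) * dw
        theta = w * w / (2.0 * decel)
        arc_index = int((theta % (2.0 * math.pi)) // arc)
        if arc_index % 2 == 0:                  # the wheel stops on a Red arc
            total += bump(w - shift) * dw
    return total

if __name__ == "__main__":
    for c in (0.0, 2.0, 20.0, 200.0):
        print("C =", c, " prob(Red) =", round(prob_red_hopf(c), 4))
```

With `shift = 0` every admissible spin is too slow to leave the first arc, so Red is certain (the "ham-fisted croupier" case); for large `shift` the velocity intervals leading to Red and to Black interleave finely and the probability approaches $\tfrac{1}{2}$.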

Keller gives a broadly similar analysis of coin-tossing (1986; developed by Diaconis et
al. 2007). He takes the coin to be a circular lamina which is initially horizontal: it is
tossed in a vertical line with an initial angular velocity $\omega$ and initial vertical velocity $u$,
and falls under gravity onto a horizontal table where it settles with either Heads or Tails
facing upward. As with Poincaré's or Hopf's wheel, the sample space of initial conditions is
intricately partitioned into subsets that lead eventually, and deterministically, to Heads
or to Tails. But as with Buffon's needle, the sample space is two-dimensional. It is the
positive quadrant of the $(\omega, u)$-plane. So the probabilities of Heads and Tails are given by
integrating over the Heads and Tails subsets, respectively, an unknown density function
$f(\omega, u)$, which Keller takes to be continuous.

Keller shows that the pattern of Heads and Tails subsets is like a "hyperbolic zebra".
Each subset is a thin strip lying along one of a series of hyperbolas, i.e. curves like
$\omega = nK/u$ with $n$ a natural number. Besides, Heads strips alternate with Tails strips;
and for higher values of $\omega$ and $u$ (i.e. as we move North-East in the positive quadrant),
the strips become thinner. This means that, in the now-familiar way, the integral of $f$, for
Heads or for Tails, over these higher values becomes less sensitive to wiggles in $f$. That is:
as the support of $f$ (or even just $f$'s "preponderant weight") tends "North-East", Heads
and Tails tend towards being equiprobable, whatever the density function.

Agreed, you might object that these analyses of Hopf and Keller, though neat, are
again unrealistic. No roulette wheel is spun, and no coin is tossed, arbitrarily fast! But
the reply is clear. It has two parts. Analyses like Hopf's and Keller's can give information
about the speed of convergence towards their limit; and this can reassure us that realistic
initial conditions lead to the desired probabilities (here: equiprobability), up to a good
accuracy, for a wide class of density functions. Here of course we return to two previous
themes:

(i) in general terms, the two merits of Section 3.3.3's Straightforward Justification of
taking a limit: mathematical convenience and empirical success; and

(ii) specifically, the value of modelling without taking a limit, i.e. the first strategy 
above, and my claim (2:Before). Recall my remark above that the two strategies are 
compatible. 

Finally, Hopf's and Keller's analyses prompt the idea of statistical stability, which has
been very important for the probabilistic study of dynamical systems. I will not go into
the measure-theoretic technicalities (about absolute continuity and types of convergence)
that are needed for an exact definition, but just convey the main idea. (This occurs,
under the label 'statistical regularity', in Hopf's own analysis of the roulette wheel.) The
general scenario is that we are given: (i) two probability spaces $(X, \mu)$ and $(Y, \nu)$, i.e.
$\mu, \nu$ are probability measures on appropriate fields of subsets of $X, Y$ respectively; (ii) a
family of maps $F_\lambda : X \to Y$, labelled by a parameter $\lambda \in \mathbb{R}$ or perhaps $\lambda \in \mathbb{N}$. Thus in our
examples above, $X$ was the space of initial conditions and $Y$ was the two-element space
{ Red, Black } or { Heads, Tails }; and each $F_\lambda$ is a deterministic map sending an initial
condition $x \in X$ to an outcome $y \in Y$.

Returning to the general scenario: $\mu_\lambda := \mu \circ F_\lambda^{-1}$ is a probability measure on $Y$, and
we can ask whether there is a measure on $Y$ to which $\mu_\lambda$ converges as $\lambda \to \infty$: or even a
measure on $Y$ to which $\mu_\lambda$ converges, for all $\mu$ on $X$ in some suitable class. If so, we say the
family $F_\lambda$ is statistically stable. In studying complicated, even "chaotic", deterministic
systems, this idea has an important special case: namely, $X = Y$, $\mu = \nu$, $\lambda \in \mathbb{N}$ and
the family $F_\lambda$ arises just by iterating a map $T : X \to X$, i.e. $F_\lambda := T^\lambda$ represents a
discrete-time evolution. In this case, the limit measure, $\mu^*$ say, characterizes the long-time
statistical behaviour of the system. In particular, it is readily shown to be invariant
under the time-evolution. That is, $T$ induces an evolution $P_T$ on measures (and their
densities) in the natural way, and we have: $P_T(\mu^*) = \mu^*$.

4.2 The claims illustrated by emergent equiprobability

I turn to describing how the limiting probabilities of Section 4.1 count as emergent
behaviour in my sense, and how they illustrate my claims (1:Deduce), (2:Before) and
(3:Herring) (listed in Section 1.2). As I announced, I will for simplicity emphasize the original
Poincaré theorem, stated at the end of Section 4.1.1. But it will be clear how the claims
are also illustrated by the generalizations given in Section 4.1.2, including the closing idea
of an invariant limit measure $\mu^*$.

The illustrations unfold immediately, once we stipulate that the limiting probabilities 
are to be the emergent behaviour. For me, this means behaviour that is novel or surprising, 
and robust, relative to a comparison class. As discussed in the companion paper, this class 
is liable to be fixed contextually, and even to be vague or subjective; but never mind,
since there does not need to be an exact meaning of 'emergence'. Here I concede that the 
limiting probabilities, especially the equiprobability of Red and Black, or Heads and Tails, 
are not novel or surprising — though I submit that it is surprising that one can deduce 
them from an arbitrary density function. In any case, they are robust in a vivid sense: 
the whole point of the method of arbitrary functions is that they are invariant under a 
choice of a density function from a wide class. 

4.2.1 Emergence in the limit: with reduction — and without 

As to (1:Deduce): we have 'reduction as deduction' in as strong a sense as you could
demand, provided we take the limit. Thus for Poincaré's theorem, we take $T_t$ to be just
the statement of equiprobability in the limit of infinite $N$, and $T_b$ to be a model of the
wheel, including enough measure theory and calculus to cover both: (i) the postulation of
various possible density functions $f$ on $[0, 2\pi]$; and (ii) consideration of the infinite limit
$N \to \infty$. And similarly for Section 4.1's other examples.

(1:Deduce) also concerns "the other side of the coin": how the emergent behaviour,
here equiprobability, is not deducible, if we do not take the limit but instead confine $T_b$
to finite $N$. This also is illustrated by Section 4.1. Thus in particular, for Poincaré's
roulette wheel: For any finite $N$, no matter how large, equiprobability will fail, as badly
as you may care to require, for a sufficiently "wiggly" density function, i.e. a sufficiently
position-sensitive biassing regime. That is, we have:

For all $\varepsilon > 0$, for all positive integers $N$, there is $M \in \mathbb{R}$ and a density function
$f$ with $|f'| < M$ such that: $\int_{\mathrm{Red}} f \, d\mu \equiv \mathrm{prob(Red)} > 1 - \varepsilon$.

So here is emergence without reduction to a weaker finitary $T_b$. Since this weaker $T_b$ is a
salient theory, one can be tempted to speak of irreducibility. Similarly for Section 4.1's
other examples.

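It is worth displaying the two sides of (1:Deduce)'s "coin" (equiprobability's deducibility
in the limit, and its non-deducibility before) in terms of a shift of quantifiers. Thus
the "form" of Poincaré's theorem is:

$$\forall \varepsilon > 0, \; \forall M \in \mathbb{R}, \; \forall f \text{ with } |f'| < M, \; \exists N \text{ s.t. } \forall N^* > N: \; \Bigl| \int_{\mathrm{Red};\, N^* \text{ arcs}} f \, d\mu - \tfrac{1}{2} \Bigr| < \varepsilon ;$$

while "the other side of the coin" is:

$$\forall \varepsilon > 0, \; \forall N, \; \exists M \in \mathbb{R} \text{ and } f \text{ with } |f'| < M, \text{ s.t.: } \Bigl| \int_{\mathrm{Red};\, N \text{ arcs}} f \, d\mu - \tfrac{1}{2} \Bigr| > \tfrac{1}{2} - \varepsilon .$$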

One can easily check that in Section 4.1's other examples, including Buffon's needle,
Hopf's roulette wheel and Keller's tossed coin, the two sides of (1:Deduce)'s "coin" involve
a similar quantifier-shift.
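
The "other side of the coin" can be made concrete for Poincaré's wheel. The sketch below is one possible construction, chosen for illustration and not taken from the text: for a given even $N$ it uses the density $f(x) \propto \exp(\kappa \sin(Nx/2))$, which peaks in the middle of every Red arc (the even-numbered ones) and dips in the middle of every Black arc. The function names and the doubling search for `kappa` are assumptions of this example.

```python
import math

TWO_PI = 2.0 * math.pi

def arc_masses(n_arcs, kappa, pts_per_arc=200):
    """Unnormalised mass on Red and on Black arcs for exp(kappa * sin(n_arcs * x / 2)).

    n_arcs must be even; even-numbered arcs are Red.
    """
    arc = TWO_PI / n_arcs
    dx = arc / pts_per_arc
    red = black = 0.0
    for j in range(n_arcs):
        mass = sum(
            math.exp(kappa * math.sin(0.5 * n_arcs * (j * arc + (i + 0.5) * dx)))
            for i in range(pts_per_arc)
        ) * dx                                   # midpoint rule on arc j
        if j % 2 == 0:
            red += mass
        else:
            black += mass
    return red, black

def defeat_equiprobability(n_arcs, eps):
    """Return (kappa, m_bound, prob_red) with prob_red > 1 - eps for the given n_arcs.

    kappa is found by doubling; m_bound bounds |f'| for the normalised density.
    Intended for moderate eps (very small eps would overflow math.exp).
    """
    kappa = 1.0
    red, black = arc_masses(n_arcs, kappa)
    while red / (red + black) <= 1.0 - eps:
        kappa *= 2.0
        red, black = arc_masses(n_arcs, kappa)
    # f'(x) = kappa * (n_arcs / 2) * cos(n_arcs * x / 2) * f(x), so |f'| <= kappa * (n_arcs / 2) * max f.
    m_bound = kappa * 0.5 * n_arcs * math.exp(kappa) / (red + black)
    return kappa, m_bound, red / (red + black)

if __name__ == "__main__":
    for n in (2, 50, 200):
        print(n, defeat_equiprobability(n, 0.05))
```

The same sharpness `kappa` defeats equiprobability for every $N$, but the derivative bound it requires grows in proportion to $N$: this is the quantifier shift in miniature, since $M$ must be chosen after $N$.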

Finally, I stress the point announced in Sections 1.1 and 3.1, that the limits we are
concerned with are in no way singular; so a singular limit is not necessary for emergence.
Nor is there any infinite system corresponding to $N = \infty$ (i.e. $\sigma(\infty)$ in Section 3.1's
notation). For the roulette wheel, that would mean a division of $[0, 2\pi]$ into a denumerable
number of equal-length segments! And similarly for the other limits: e.g. Hopf's roulette
wheel spun, or Keller's coin tossed, with an infinite initial angular velocity.



4.2.2 Emergence before the limit 

(2:Before) claims that before the limit, there is emergence in a weaker but still vivid sense.
Here the weaker sense is approximate rather than exact equiprobability, for some realistic
model of the roulette wheel (or other chance set-up). So we already saw in Section 4.1.2
how the method of arbitrary functions illustrates this claim: namely, in the discussion
of the finite parameter case, both (i) as a first strategy for defending Poincaré's roulette
wheel and (ii) as a reply to the parallel objection to Hopf or Keller, that no wheel is spun,
no coin is tossed, arbitrarily fast. For both (i) and (ii), we calculate, perhaps numerically,
how wiggly (in some sense) the density function could be, while yielding approximately
the probabilities we observe and/or desire: e.g. for the roulette wheel, equiprobability of
Red and Black.

Speaking of desire raises issues of engineering: indeed, of the profitability of casinos.
We know that casinos manage to get profitably close to equiprobability, with some small
number, $N \approx 50$, of arcs. And we surmise that even if they had a worryingly wiggly
$f$, they could get profitably close to equiprobability by putting $N$ up to say about 200;
or, following Hopf's idea, by spinning the wheel, on average, some two to three times
faster. Here we meet the multi-faceted, even interest-relative, even subjective, question:
how close is close enough? 'Close enough for all practical purposes': but what exactly are
the practical purposes? How wiggly an $f$ need the casino guard against?

But I submit that this is a question for casino-owners, who can no doubt pay staff
well enough to answer it accurately for them. At our (typically philosophical!) level of

^^I said there can be no division of $[0, 2\pi]$ into a denumerable number of equal-length segments. No
sooner said than doubted, as so often in philosophy. I am grateful to Alan Hájek for pointing me to
Edward Nelson's adaptation of the ideas of non-standard analysis to probability theory; cf. Nelson (1987,
especially Chapters 4 to 7).
generality, we do not need to try and answer it. For us, it is enough that given a resolution
of this and similar questions, including vaguenesses, we get a notion of approximate
equiprobability, which can indeed be deduced from a $T_b$ with parameters that are not
only finite, but also realistic. In particular, $T_b$ can imply profitable (for the gamblers:
indiscernible) closeness to equiprobability, using some $N \approx 50$ arcs on the roulette wheel,
and an initial velocity of some $10\pi$ to $30\pi$ radians per second (5 to 15 revolutions per
second).

4.2.3 Supervenience is a red herring 

I turn to my third, ancillary, claim (3:Herring). Namely: although various supervenience 
theses are true, they yield little or no insight into emergence, or more generally, into "what 
is going on" in the example. 

This is well illustrated by Poincaré's roulette wheel, and Section 4.1's other examples.
For any sequence of spins of the wheel, with any number $N$ of arcs, and any regime
governing its initial velocities, the frequency of Red is of course determined by, supervenient
upon, all the microscopic details of the wheel and its many spinnings. This
supervenience thesis holds for a finite sequence of spins; or an infinite one, with frequency 
defined as limiting relative frequency. And there are analogous supervenience theses for 
probability, rather than frequency: the probability of Red is determined by the details of 
the wheel, especially the choice of probability density function. Similarly of course, for 
coin-tosses, and the frequency or probability of Heads. 

I submit that these supervenience theses, whether for frequency or probability, shed 
no light on the matters at hand. For they make no connection with the basic idea of the 
method of arbitrary functions: that intricate partitions of a sample space can wash out 
the peaks and troughs of an unknown density function, and secure robust probabilities. 
This is a good illustration of my general reasons (in Section 1.2) for supervenience theses'
irrelevance: that they make no connection between their idea of a variety, perhaps even
infinity, of ways to have the higher-level property $P$, and the limit processes on which the
example turns. Thus here, $P$ is the property that a frequency or probability of Red (or
of Heads) is $\tfrac{1}{2}$, or is the property of two events being equiprobable; and the example's
limit processes are the number of arcs, or the initial velocities, going to infinity, so as to
implement the basic idea of washing out peaks and troughs. Or we can eschew the limit
and use only finite parameters, as in (2:Before). But again, these supervenience theses
shed no helpful light.

^^In Section 1131 we assumed, for the most part implicitly, that these details were based on classical 
mechanics. But the same supervenience thesis would hold if we assumed instead that they were based on 
quantum theory. At least, this is so if we set aside the quantum measurement problem, which threatens 
to deny us any definite macroscopic events. The companion paper discusses some dangers in the idea of 
supervenience on the microscopic details, "whatever they might be" . 

^^A caveat. I agree that these supervenience theses are relevant to the philosophy of probability, 
especially for an empiricist. For example: if we maintain that the empiricist should accept the model's 
microscopic details, say because they are "occurrent", then the supervenience theses for frequencies 
support the idea that they should also accept frequencies — as a metaphysical free lunch, as people say. 
But in (d) at the start of Section 4.1.2, I forswore the philosophy of probability: for some discussion in

5 Fractals



My second example is fractals, or rather, one small aspect of this large field: namely, the
idea that a set of spatial points, i.e. a subset of $\mathbb{R}^n$ ($n = 1, 2, \ldots$), can have a dimension
that is not an integer. As we shall see, one can define various notions of dimension; and
much of the discussion and results carry over to spaces more general than Euclidean space
$\mathbb{R}^n$. However, I will emphasise one notion of dimension, scaling dimension (also known
as: similarity dimension), and confine myself to $\mathbb{R}^n$. Even a very short introduction to
this topic (Section 5.1) will be enough to illustrate my claims. For my first three claims,
details are in Section 5.2. The discussion is similar to that in Section 4.2.

But as I mentioned in Section 2, I propose for fractals to also discuss my fourth claim
(4:Unreal): that for large but finite $N$, the example becomes unrealistic, for reasons that
are usually ignored in discussions of emergence. I do this in Section 5.3. This will mean
that in Sections 5.1 and 5.2, the pure mathematics of dimension in Euclidean geometry
will be prominent: the empirical world will come to the foreground only in Section 5.3.
For in this fractals example, large $N$ corresponds to very small length-scales; so that here,
(4:Unreal) amounts to a 'No' answer to the question 'Is fractal geometry the geometry of
nature?' In other words: (4:Unreal) denies that fractal descriptions of physical objects
are literally true: a denial which my first three claims can largely ignore. Section [53] will 
sum up. 

5.1 Self-similarity and dimension as an exponent 

The key innovation of fractals is to extend, from familiar geometric objects such as squares 
and cubes to much more "irregular" sets, two related ideas: (i) self-similarity and (ii) 
dimension as an exponent. 

Recall that a square with edge $l$ is the union of $l^2$ unit squares; e.g. a square whose
edge is $l = 3$ units long is the union of $3^2 = 9$ unit squares. And a cube with edge $l$ is the
union of $l^3$ unit cubes; e.g. a cube whose edge is $l = 3$ units long is the union of $3^3 = 27$
unit cubes. These examples exhibit both the ideas (i) and (ii), as follows.

(i) : The square or cube is a union of smaller copies of itself; and the decomposition 
involved can be iterated indefinitely — imagine repeatedly shrinking the unit of length / 
by some factor. 

(ii) : In the formula for the measure (area, volume) of the object (i.e. the number 
of unit building blocks in it), the dimension occurs as an exponent, and takes the same 
value, however fine the decomposition i.e. however small we choose the unit of length. 

$$\text{number of unit blocks in object with edge } l \;=\; l^{\,\text{dimension of object}}. \qquad (5.1)$$

the context of the method of arbitrary functions, cf. the papers by Frigg and Hoefer, and Myrvold cited 
in footnote [T^j 

^^I should mention a reason for restricting attention to Euclidean space $\mathbb{R}^n$. Namely: Euclidean
geometry admits similarity (of triangles and other figures), while non-Euclidean geometries in general
do not; and on our approach, the definition of fractals needs the idea of similar figures. Section will 
return to this point. 

So the main idea of fractals is that similarly:

(i'): Some "irregular" sets of points are unions of smaller copies of themselves; where,
again, the decomposition involved can be iterated indefinitely. Among these sets will be
some famous examples, which were treated as "pathological" when first explored some
hundred years ago: in particular, the Cantor 'middle thirds' set $C$ which is a subset of
the unit interval $[0, 1] \subset \mathbb{R}$ (1872), and the Koch snowflake $K$ which is a subset of the
unit square (1906).

(ii'): Applying the idea of eq. 5.1 to such sets, we find that they have non-integral
dimensions. For example, the Cantor set has dimension about 0.63, and the Koch snowflake
has dimension about 1.26.

These ideas are connected to my themes of emergence and reduction, owing to the fact
that these sets are defined by taking a limit $N \to \infty$ of an iterated process of definition.
Thus, later in this Section, I will take non-integral dimension to be the emergent (i.e. novel
and robust) behaviour, which is deduced (and so reduced!) in the limit.

I shall now develop ideas (i) and (ii), especially eq. 5.1, more formally. But how
fractals illustrate my claims about emergence and reduction does not depend on these
details, and the reader uninterested in geometry can now skip to Section 5.2. But I
should also stress, on the other hand, that what follows is the merest glimpse of the
modern theory of dimension. I shall rein in the exposition, and say only enough: (a) to
define the scaling dimension, and see how it can be non-integral (Section 5.1.1); and (b)
to sketch how scaling dimension relates to other concepts of dimension (Section 5.1.2).

5.1.1 Examples: scaling dimension 

I begin by defining the Cantor set and Koch snowflake. This will show that they are
self-similar, i.e. unions of smaller copies of themselves; and this will imply that using eq.
5.1's idea of dimension as exponent, both these sets have a non-integer dimension. Then
I give a general definition of scaling dimension.

5.1.1.A: The Cantor set $C$: This is defined as the intersection of infinitely many
other subsets, which we will call 'stages', labelled 0, 1, 2, ... The unit interval $[0, 1]$ is stage
0. After stage 0, each later stage is obtained by deleting the open middle third of each
closed interval of its predecessor. So stage 1 is $[0, 1]$, minus its open middle third. That
is: stage 1 is $[0, \tfrac{1}{3}] \cup [\tfrac{2}{3}, 1]$. Then stage 2 is defined by deleting the open middle third of
each of $[0, \tfrac{1}{3}]$ and $[\tfrac{2}{3}, 1]$. So stage 2 consists of four disjoint closed intervals: it is the set
$[0, \tfrac{1}{9}] \cup [\tfrac{2}{9}, \tfrac{1}{3}] \cup [\tfrac{2}{3}, \tfrac{7}{9}] \cup [\tfrac{8}{9}, 1]$. And so on. Thus stage $N$ is the union of $2^N$ intervals, each
interval being of length $(\tfrac{1}{3})^N$. So the total length of stage $N$ is $2^N \times (\tfrac{1}{3})^N = (\tfrac{2}{3})^N$. So as
$N$ goes to infinity, the length of stage $N$ goes to 0.
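
The stage-by-stage construction is easy to mechanise. The following Python sketch (the function names are chosen for this illustration) builds the intervals of stage $N$ with exact rational arithmetic, so that the count $2^N$ and the total length $(\tfrac{2}{3})^N$ can be checked without rounding error.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals of stage n as (left, right) pairs of Fractions."""
    intervals = [(Fraction(0), Fraction(1))]      # stage 0 is the unit interval
    for _ in range(n):
        next_stage = []
        for a, b in intervals:
            third = (b - a) / 3
            next_stage.append((a, a + third))     # keep the closed left third
            next_stage.append((b - third, b))     # keep the closed right third
        intervals = next_stage                    # the open middle thirds are gone
    return intervals

def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(b - a for a, b in intervals)

if __name__ == "__main__":
    for n in range(5):
        stage = cantor_stage(n)
        print(n, len(stage), total_length(stage))
```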

$C$ is defined to be the intersection of all the stages. Thus $C$ contains those real numbers
between 0 and 1 that have a ternary expansion (i.e. using digits 0, 1, 2) with no digit 1: so
$C$ is uncountable. Agreed, $C$ is hard to visualize! Its topological properties include: it is
closed, it is nowhere-dense (i.e. its closure has an empty interior) and its complement is
a dense subset of $[0, 1]$.

Now we apply to $C$ the idea of eq. 5.1. Think of $C$ as the unit block of "Cantor
type". And observe that $C$ is the union of two shrunken copies of itself, each smaller by
a factor of 3. That is: one shrunken copy is built by applying the infinite 'delete and
take intersection' process to $[0, \tfrac{1}{3}]$, and the other shrunken copy by applying the process
to $[\tfrac{2}{3}, 1]$.

This observation can be reproduced at the next scale up. That is: we can define
the "Cantor type" object of scale 3, call it $C'$, as the set that results from applying the
infinite 'delete and take intersection' process to $[0, 3]$, rather than to $[0, 1]$. Then just as
our original $C$ is the union of two shrunken copies of itself, each smaller by a factor of 3,
so also is $C'$. That is: $C'$ is the union of two unit-size Cantor sets. Now we apply the
idea of eq. 5.1, getting

$$\text{number of unit Cantor sets in Cantor object of scale 3} \;=\; 2 \;=\; 3^{\,\text{dimension of } C}. \qquad (5.2)$$

Now we recall that for any logarithm base $a$, $b = c^{\log_a b / \log_a c}$, so that in the case of
interest: $2 = 3^{\log 2 / \log 3}$. Here we drop the suffix stating the base, since the ratio of
logarithms is independent of the base. That is: the dimension of $C$ is $\log 2 / \log 3$, which
is about 0.63.

5.1.1.B: The Koch snowflake $K$: This also has an iterative construction. Roughly
speaking: we erect smaller and smaller equilateral triangles in the middles of the sides
of a polygon, and define $K$ as the limit. Thus stage 0 is an equilateral triangle. Stage
$N + 1$ is constructed from stage $N$ by replacing each line segment of stage $N$ by 4 line
segments, each one-third the length of the original. It follows that the perimeter of the
polygon grows without bound: if $P$ is the perimeter of the initial triangle, then stage $N$
consists of $3 \times 4^N$ segments each of length $P/3^{N+1}$, so that its perimeter is $(4/3)^N P$.
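
The bookkeeping behind the growing perimeter can be checked in a few lines. This Python sketch (function name and default initial perimeter are illustrative choices) follows the replacement rule literally, four segments for one and each a third as long, using exact fractions.

```python
from fractions import Fraction

def koch_stage(n, initial_perimeter=Fraction(3)):
    """Return (number_of_segments, segment_length, perimeter) for stage n."""
    segments = 3                                   # stage 0: an equilateral triangle
    length = Fraction(initial_perimeter) / 3       # each side is a third of the perimeter
    for _ in range(n):
        segments *= 4                              # each segment is replaced by 4 segments
        length /= 3                                # each one-third the length of the original
    return segments, length, segments * length

if __name__ == "__main__":
    for n in range(6):
        print(n, koch_stage(n))
```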

So this is different from the Cantor set in that $K$ is not itself the union of similar
smaller snowflakes. But each "side" of $K$ is the union of four smaller similar curves, each
smaller by a factor 3. So applying again the idea of eq. 5.1, we get:

$$4 = 3^{\,\text{dimension of } K}; \quad \text{so:} \quad \text{dimension of } K = \frac{\log 4}{\log 3} \approx 1.26. \qquad (5.3)$$

5.1.1.C: Scaling dimension defined: With these examples as motivation, I proceed to
a general definition. The main effort is in defining the preliminary notion of self-similarity.
For in general we need to allow that the smaller copies (of which the object, i.e. set, we
are concerned with is a union) overlap, i.e. have non-empty intersection. But we require
them to overlap "minimally" in the sense that their intersection is of lower dimension
(in the usual integer-valued sense!) than the copies themselves. Examples include: two
continuous curves that have a finite set of points in common; two rectangles that have
parts of the boundaries in common.



^19 This observation can be iterated "downward". $C$ is also the union of $2^2$ shrunken copies of itself,
each smaller by a factor of $3^2$. And $C$ is the union of $2^3$ shrunken copies of itself, each smaller by a factor
of $3^3$; and so on: for each $N$, $C$ is the union of $2^N$ shrunken copies of itself, each smaller by a factor of
$3^N$.

For the moment, I will take the usual integer-valued notion of dimension for granted
(Section 5.1.2 will rehearse a standard definition of it). Then we say that a set $X \subset \mathbb{R}^n$ is
an almost-disjoint union of two sets $Y, Z$ iff $X = Y \cup Z$ and $Y \cap Z$ has lower dimension than
the dimensions of $Y$ and $Z$. One similarly defines almost-disjoint unions of more than
two sets. And one defines $X$ to be self-similar if it is an almost-disjoint union of shrunken
copies of itself. Here 'shrunken copies' can be made precise by using the vector space
structure of $\mathbb{R}^n$: (i) to scalar-multiply the vectors in the set $X$ by a common contraction
factor, and (ii) to translate the resulting shrunken copies out of coincidence with one
another, so as to give an almost-disjoint union. Thus we say: $X$ is self-similar if it is the
almost-disjoint union of $m$ copies of $X$, each contracted by a common factor $k$, and then
translated by a (non-common) vector $v_i$. Thus in an obvious notation

$X = \bigcup_{i=1}^{m} \left[ \frac{1}{k} X + v_i \right]$. (5.4)

Then we define the scaling dimension of $X$ to be: $\log m / \log k$.

5.1.2 Generalizations: three other concepts of dimension

Our definition of scaling dimension, eq. 5.4, is limited to exactly self-similar objects. But
the idea that a dimension occurring as an exponent in a power law can be non-integral
can be developed for much more general kinds of object. These include: (i) allowing
the contraction factor for the building-block set $X$ to be anisotropic (called 'self-affinity',
instead of 'self-similarity'); and (ii) introducing probabilities governing the contractions
and-or translations of $X$, so that one considers an ensemble of random fractals, almost
all of which are not exactly self-similar.

These developments have both empirical and theoretical aspects, which have of course
influenced one another over the years. In this Subsection, I round off our glimpse of the
modern theory of dimension by sketching some of these developments: first the empirical,
then the theoretical. There will be a common key idea: to substitute for Section 5.1.1's
contractions of a figure, the complementary idea of contracting a grid of lines (or planes
or hyperplanes), or something analogous to such a grid, like a family of boxes or discs
that appropriately cover the figure.

5.1.2.A: Empirical aspects: — Countless empirical studies have found power law
behaviour with a dimension as a non-integral exponent. One famous example is Richardson's
(1961) discussion of measuring the length of a coastline by traversing it from point
to point, as if with a pair of dividers. Richardson envisages indefinitely improving the
resolution, i.e. reducing the divider-distance. For a continuous curve, we would have the
familiar limit: as the resolution length (divider-distance) $\delta \to 0$, the number of steps $n(\delta)$
needed to traverse the coastline grows unboundedly in such a way that the estimate of
the length, $n(\delta)\,\delta$, tends to $l$: where $l$ is the usual length of the curve, given by calculus
as $l = \int ds$. We can express this as a power law with the curve's dimension $D = 1$ as an
exponent. Namely, we would have:

$n(\delta) \approx l/\delta = l/\delta^{1} = l/\delta^{D}$ with $D = 1$. (5.5)

But applying the dividers method to ever-larger scale maps suggests instead that as $\delta \to 0$,
the estimated length $n(\delta)\,\delta$ increases without bound, i.e. $n(\delta) \approx \text{constant} \times 1/\delta^{D}$ with $D$
strictly greater than 1. This is of course like Section 5.1.1's discussion of the length, and
the dimension, of (the side of) the Koch snowflake: except that intuitively a coastline has
bays as well as promontories — concave portions as well as convex ones. But this can be
modelled using random fractals, as mentioned in (ii) above.
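
The contrast between $D = 1$ and $D > 1$ can be illustrated by fitting the exponent from divider counts. This is a minimal sketch in Python under two idealising assumptions that are mine, not the text's: for a straight unit segment, dividers set to $2^{-j}$ take exactly $2^j$ steps; and for one side of the Koch snowflake (unit base), dividers set to $3^{-j}$ step from vertex to vertex of the stage-$j$ polygon, so take $4^j$ steps. The helper `fit_exponent` is an illustrative name for an ordinary least-squares slope.

```python
import math

def fit_exponent(deltas, counts):
    """Least-squares slope of log n against log(1/delta): the D in n ~ const / delta**D."""
    xs = [math.log(1.0 / d) for d in deltas]
    ys = [math.log(n) for n in counts]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Straight unit segment (assumed idealisation): 2**j steps at divider-distance 2**-j.
line_deltas = [2.0 ** -j for j in range(1, 9)]
line_counts = [2 ** j for j in range(1, 9)]

# One side of the Koch snowflake (assumed idealisation): 4**j steps at divider-distance 3**-j.
koch_deltas = [3.0 ** -j for j in range(1, 9)]
koch_counts = [4 ** j for j in range(1, 9)]

D_line = fit_exponent(line_deltas, line_counts)
D_koch = fit_exponent(koch_deltas, koch_counts)

# Estimated length n(delta) * delta: constant for the segment, growing like (4/3)**j for the Koch side.
estimated_lengths = [n * d for n, d in zip(koch_counts, koch_deltas)]
print(D_line, D_koch, estimated_lengths[:3])
```

For real coastline data the counts would of course come from measurement, and the fitted exponent would hold only over the range of resolutions actually probed.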

There are many other examples of such empirical power laws: often, as in this example,
with the quantity of interest, $f$ say, proportional to a power of a resolution $\delta$:
$f = \text{constant} \times \delta^{D}$. In many cases, of course, the exponent represents, not length, area or
volume, but some other physical quantity. But there are also plenty of cases where the
exponent is a dimension (in our sense, not the more general sense of 'physical dimension'!).
Thus Brady and Ball (1984) studied the dendritic growth of copper electrodeposited
onto an initially pointlike cathode. They found that the volume (or mass) of copper was
proportional to $R^D$, where $R$ is the radius and $D$ was about 2.43 — in good agreement
with computer-simulations.

5.1.2.B: Theoretical aspects: — For my purposes, the main point here is that the modern 
theory of dimension recognizes several different concepts, and of course includes many 
theorems relating the agreements and differences in the dimensions assigned to various 
sets. I shall sketch three such concepts. As I mentioned at the start of this Subsection, 
they share a common general idea: viz. successively finer covers of the set in question, or 
something analogous like successively finer grids of lines or (hyper)planes. 

I start, for the sake of completeness, with the traditional, i.e. topological, integer- 
valued notion. My other two notions are the Hausdorff dimension and the box dimension. 
They are like scaling dimension, not just in taking non-integral values; but also in the 
general underlying reason for this, viz. some quantity showing power law behaviour. 
Besides, the dimension they assign to an exactly self-similar set, as in eq. 5.4, is equal
to the scaling dimension: viz., in the notation of eq. 5.4, $\log m / \log k$ (Falconer 2003, pp.
xxiv, 129; Hastings and Sugihara 1993, pp. 31, 34, 40). So they are generalizations of
scaling dimension, in the clear sense that if a set $X$ has a scaling dimension $D$ then it
also has both Hausdorff and box dimension equal to $D$. But as I mentioned, each of these
notions also applies to a much wider class of sets. Besides, they have in common that the 
power law behaviour occurs as a cover of a set, or something analogous like a grid of lines 
or planes, becomes finer. But they are inequivalent notions: for some sets, their values 
disagree. 

Topological dimension: This can be defined for a general topological space; but I
restrict myself to compact subsets of $\mathbb{R}^n$. There are various ways to motivate the definition.
Among the clearest is to consider the task of covering the unit square with closed
rectangles in such a way that as few rectangles as possible have points in common. Suppose
we cover the square with a lattice of rectangles; (so the square is their almost-disjoint
union). Then a point at a corner of the lattice is in four rectangles. If instead we stagger
the rectangles, giving a brick-wall pattern, then each point at a corner is in only three
rectangles. On the other hand, it seems this arrangement cannot be improved — except of
course by making the rectangles so large that we only need two, or even one(!), to cover the
square. Similarly in three dimensions. A few minutes' reflection suggests that: (a) for
covering the unit cube with arbitrarily small closed rectangular solids, one can arrange that
no point in the cube is contained in more than four solids; but (b) for sufficiently small
solids, at least four solids have a common point. Similarly, one naturally conjectures, for
unit hypercubes $[0, 1]^n \subset \mathbb{R}^n$: (a) $[0, 1]^n$ can be decomposed as an almost-disjoint union
of arbitrarily small $n$-rectangles, in such a way that no more than $n + 1$ of them have a
common point; but (b) $n + 1$ is the least such number, i.e. in any such decomposition of
$[0, 1]^n$, there must be a point common to at least $n + 1$ of the $n$-rectangles.

This prompts the following definitions. Let $X \subset \mathbb{R}^n$. Let $\mathcal{U}$ be a cover of $X$ by finitely
many sets, and let $\delta > 0$. $\mathcal{U}$ is called a $\delta$-cover if each element of $\mathcal{U}$ has diameter less
than $\delta$. (The diameter $\mathrm{diam}\, U$ of a set $U$ is $\sup_{x, y \in U} | x - y |$.) The order, $\mathrm{ord}\, \mathcal{U}$, of $\mathcal{U}$ is
the natural number $m \in \mathbb{N}$ for which there is a point of $X$ belonging to $m$ elements of $\mathcal{U}$,
but no point belonging to $m + 1$ elements. Then we say that $X$ has topological dimension
$m$ iff $m$ is the least integer for which, for any $\delta > 0$, there is a finite closed $\delta$-cover of $X$
of order $m + 1$.
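
The lattice and brick-wall covers of the unit square can be compared by computing their orders directly. This is a minimal sketch in Python; the names `lattice_cover`, `brick_cover` and `order` are illustrative, and the brick-wall layout (odd rows shifted by half a brick and clipped to the square) is one assumed way of staggering the rectangles.

```python
from fractions import Fraction
from itertools import product

def lattice_cover(n):
    """n x n grid of closed squares of side 1/n, as tuples (x0, y0, x1, y1)."""
    h = Fraction(1, n)
    return [(i * h, j * h, (i + 1) * h, (j + 1) * h)
            for i in range(n) for j in range(n)]

def brick_cover(n):
    """Rows of closed bricks; odd rows are shifted by half a brick and clipped to [0, 1]."""
    h = Fraction(1, n)
    rects = []
    for j in range(n):
        shift = h / 2 if j % 2 else Fraction(0)
        for i in range(n + 1):
            x0 = max(Fraction(0), i * h - shift)
            x1 = min(Fraction(1), (i + 1) * h - shift)
            if x0 < x1:                      # drop bricks clipped away entirely
                rects.append((x0, j * h, x1, (j + 1) * h))
    return rects

def order(rects):
    """Largest number of rectangles that share a common point.

    If closed axis-parallel rectangles share any point, they share the point
    (largest left edge, largest bottom edge); so it suffices to test points
    built from the rectangles' left and bottom coordinates.
    """
    xs = {r[0] for r in rects}
    ys = {r[1] for r in rects}
    return max(
        sum(1 for (x0, y0, x1, y1) in rects if x0 <= x <= x1 and y0 <= y <= y1)
        for x, y in product(xs, ys)
    )

print(order(lattice_cover(4)), order(brick_cover(4)))
```

Under these assumptions the lattice has order 4 and the brick wall order 3, for every $n \geq 2$; the latter matches the value $m + 1$ with $m = 2$ for the square. The sketch does not, of course, show that 3 cannot be improved on for small rectangles: that is the substantive part of the theory.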

This definition is the beginning of a rich theory. In particular, one shows that it
gives the intuitive verdicts about familiar sets of points: finite sets of points, lines, planes
and solids get dimensions 0, 1, 2 and 3 respectively; and so on for $\mathbb{R}^n$. One shows that
dimension thus defined is a topological invariant. It is also easy to check that the Cantor
set has topological dimension 0.

Hausdorff dimension: The definition proceeds in two main steps. (1): We first sum the
diameters, raised to a power $s$, of the elements of a cover of the set $X$, and consider the
limit as the supremum of these diameters goes to 0. As the set $X$ varies, we get for fixed
$s$ a function $H^s$, which is (an outer measure, and thereby) a measure on an appropriate
field of sets — which includes the Borel sets. (2): These measures, parameterized by $s$,
have the curious property that for any given set $X$, the value $H^s(X)$ is zero or infinity
for most $s$. It is this property that yields the definition of the dimension. The details are
as follows.

(1): Let $X \subset \mathbb{R}^n$ and $s \geq 0$ and $\delta > 0$. We define

$H^s_\delta(X) = \inf \sum_{i=1}^{\infty} (\mathrm{diam}\, U_i)^s$ (5.6)

where the infimum is over all countable $\delta$-covers $\{U_i\}$ of $X$. (One can check that $H^s_\delta$ is
an outer measure on $\mathbb{R}^n$.) Now we let $\delta \to 0$:

$H^s(X) := \lim_{\delta \to 0} H^s_\delta(X) = \sup_{\delta > 0} H^s_\delta(X)$. (5.7)

This limit exists, though it may be infinite, because $H^s_\delta$ increases as $\delta$ decreases. $H^s$ is an outer
measure, and so restricts to a measure on the $\sigma$-field of $H^s$-measurable sets. This includes
the Borel sets, and the measure is called Hausdorff $s$-dimensional measure.

(2): For any $X$, $H^s(X)$ is clearly non-increasing as $s$ increases from 0 to $\infty$. And if
$s < t$, then $H^s_\delta(X) \geq \delta^{s-t} H^t_\delta(X)$. This implies that if $H^t(X)$ is positive, then $H^s(X)$ is
infinite. So there is a unique value, $\dim_H(X)$, the Hausdorff dimension of $X$, such that

$H^s(X) = \infty$ if $0 \leq s < \dim_H(X)$; and $H^s(X) = 0$ if $\dim_H(X) < s < \infty$. (5.8)
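
The jump in eq. 5.8 can be glimpsed for the Cantor set by evaluating the sum in eq. 5.6 on its "natural" covers. This is a minimal sketch in Python; `natural_cover_sum` is an illustrative name. It uses only the fact that stage $N$ of the construction consists of $2^N$ closed intervals of length $3^{-N}$, which together cover $C$.

```python
import math

D = math.log(2) / math.log(3)   # scaling dimension of the Cantor set, about 0.63

def natural_cover_sum(N, s):
    """Sum of diam**s over the 2**N intervals of length 3**-N that make up stage N."""
    return 2 ** N * (3.0 ** -N) ** s

for s in (0.5, D, 0.8):
    print(round(s, 3), [natural_cover_sum(N, s) for N in (5, 10, 20)])
```

Each such sum is only an upper bound on $H^s_\delta(C)$, for any $\delta > 3^{-N}$, since the infimum in eq. 5.6 ranges over all countable $\delta$-covers. So the sums tending to 0 for $s > \log 2 / \log 3$ show that $H^s(C) = 0$ there, and hence $\dim_H(C) \leq \log 2 / \log 3$. That the sums diverge for smaller $s$ is suggestive, but the matching lower bound needs a separate argument (cf. Falconer 2003, Chapter 2).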
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A rich theory ensues: for its beginning, cf. Falconer (2003, Chapter 2). 

Box dimension: The idea is to count the minimum number $N(\delta)$ of closed $n$-cubes of
a given edge-length $\delta$ that cover the set in question, and to consider the limit as $\delta \to 0$.
In the now-familiar way suggested by the scaling dimension, $\log m / \log k$, of eq. 5.4, the
dimension is defined as

$\lim_{\delta \to 0} \frac{\log N(\delta)}{-\log \delta}$. (5.9)

The minus sign is needed to make the dimension positive, since $\log \delta \to -\infty$ as $\delta \to 0$.
In fact, we can work, equivalently and conveniently, with the smallest number of closed
balls of radius $\delta$.

As to the conditions for the limit to exist, I here just recall that any sequence $\{a_n\}$ of
real numbers has a lim inf and a lim sup (which may equal $\pm\infty$), defined as follows: lim
inf $a_n$ is the number $a$ such that: (i) for all $\varepsilon > 0$, $a_n$ is eventually forever greater than
$a - \varepsilon$, i.e. $\forall \varepsilon > 0, \exists N, \forall M > N, a_M > a - \varepsilon$; and (ii) for all $\varepsilon > 0$, the sequence forever
returns to being less than $a + \varepsilon$, i.e. $\forall \varepsilon > 0, \forall N, \exists M > N, a_M < a + \varepsilon$. The requirements
(i) and (ii) imply that such a number $a$ is unique; and if there is no such real number, we
set lim inf $a_n = -\infty$ if the sequence is unbounded below, and $+\infty$ otherwise. Similarly for lim sup.
(One could summarize in topological jargon
by saying that lim inf is the (possibly infinite) smallest of the sequence's accumulation
points; and lim sup $a_n$ is the (possibly infinite) largest of its accumulation points.) So for
any bounded set $X \subset \mathbb{R}^n$, we can define the lower and upper box dimension by

$\underline{\dim}_B(X) := \liminf_{\delta \to 0} \frac{\log N(\delta)}{-\log \delta}$; $\overline{\dim}_B(X) := \limsup_{\delta \to 0} \frac{\log N(\delta)}{-\log \delta}$; (5.10)

and then we say that if these values are equal, that value is $X$'s box dimension $=: \dim_B(X)$.

Again, a rich theory ensues (Falconer 2003, Chapter 3; Barnsley 1988, Chapter 5). For
example: (i) familiar "regular" sets like points, lines and planes have box dimension equal
to their topological dimension; (ii) for any set $X$, $\dim_H(X) \leq \underline{\dim}_B(X) \leq \overline{\dim}_B(X)$; and
(iii) for a wide class of sets, the box and Hausdorff dimension are equal — but the box
dimension has the advantage that it is often easier to calculate.
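
Point (i), and the ease of calculation, can be illustrated by a box count on a grid. This is a minimal sketch in Python, with several assumptions of mine: each set is represented by a finite sample of points with integer coordinates over $3^n$; grid cells of side $3^{-k}$ are half-open; and the count is of cells hit by the sample, rather than the minimum number of covering cubes. The names `cantor_lefts`, `cells_hit` and `box_estimate` are illustrative.

```python
import math
from itertools import product

def cantor_lefts(n):
    """Integer numerators a, over 3**n, of the left endpoints of the 2**n intervals of stage n."""
    pts = [0]
    for level in range(n):
        step = 2 * 3 ** (n - 1 - level)   # offset of the right-hand third at this level
        pts = [a + d for a in pts for d in (0, step)]
    return pts

def cells_hit(points, n, k):
    """Number of grid cells of side 3**-k containing a sample point (coordinates over 3**n, k <= n)."""
    shrink = 3 ** (n - k)
    return len({tuple(c // shrink for c in p) for p in points})

def box_estimate(points, n, k):
    """log N(delta) / (-log delta) at delta = 3**-k, using the cell count as N(delta)."""
    return math.log(cells_hit(points, n, k)) / (k * math.log(3))

n = 4
segment = [(a,) for a in range(3 ** n)]            # sample of the unit interval
square = list(product(range(3 ** n), repeat=2))    # sample of the unit square
cantor = [(a,) for a in cantor_lefts(n)]           # endpoints belonging to the Cantor set

for name, pts in (("segment", segment), ("square", square), ("Cantor", cantor)):
    print(name, [round(box_estimate(pts, n, k), 3) for k in range(1, n + 1)])
```

For these three samples the estimate does not vary with $k$: it is 1 for the segment, 2 for the square, and $\log 2 / \log 3$ for the Cantor sample. For less regular sets one would expect the ratio to settle down only as $\delta \to 0$, if at all.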



5.2 The claims illustrated by emergent dimensions 

I turn to describing how the non-integral dimensions of Section 5.1 count as emergent
behaviour in my sense, and how they illustrate my claims (1:Deduce), (2:Before) and
(3:Herring) (listed in Section 1.2). As I announced, the illustrations do not need all the
details, especially of Section 5.1.2. To keep things simple and brief, I specialize to sets like
the Cantor set and Koch snowflake (Section 5.1.1) that are defined by taking a limit of an
iterated process of definition. Then the illustrations unfold immediately, once we stipulate
that having a non-integral dimension is to be the emergent property or behaviour: i.e.
novel (or surprising) and robust, relative to a comparison class.

Certainly, non-integer dimensions are novel (more so than Section 4's limiting
probabilities). And they are 'robust' in at least two senses. First, the scaling dimension of
Section 5.1.1 obviously takes the same value for congruent sets of points, and for enlarged
and reduced versions of a given set: this invariance is a kind of robustness. Second and
more interesting: as discussed in Section 5.1.2, there are various novel notions of
dimension which can take non-integer values, and which are "cousins" of each other in various
ways. They share the ideas of dimension as an exponent, and of taking successively finer
covers or grids; and for wide classes of sets, their values agree. In particular, the values
of Section 5.1.1's scaling dimension are endorsed by Section 5.1.2's Hausdorff and box
dimension. So indeed it is fair to talk of 'emergent dimensions'.

5.2.1 Emergence in the limit: with reduction — and without 

As to (1:Deduce): we have 'reduction as deduction' in as strong a sense as you could
demand — provided we take the limit. The general situation is that at stage $N = 0$, a
"regular" set is given. Here "regular" can mean various things depending on the context,
but I will take it to always imply having a well-defined topological dimension. Another
set is then defined, yielding stage $N = 1$, by a process that can be iterated to give sets at
stages $N = 2, 3, \ldots$. At all finite stages, the defined sets are regular. And for a wide class
of cases (including Section 5.1.1's Cantor set $C$ and Koch snowflake $K$), the stages'
dimensions are all equal, and equal to the integer you would expect. For example, at stage $N$
for the Cantor set $C$, the defined set, $C_N$, is a union of closed sub-intervals of the unit
interval; and its topological dimension is 1, as you would expect. Similarly for the stages
in defining $K$. But the "irregular" set is defined by taking the limit $N \to \infty$. In general
it has a different topological dimension: thus $\dim(C) = 0 \neq \dim(C_N) = 1$. So topological
dimension is not continuous in the limit; (footnote 3 notes how this shows discontinuous
limits do not imply emergence). And more important for us: according to one or more of
the novel notions of dimension (scaling, Hausdorff, box etc.), the set has a non-integral
dimension. For example, $C$'s dimension (according to all three notions) is about 0.63.

Thus the non-integral dimension, the emergent behaviour, is indeed deduced (and so
reduced!) in the limit. In terms of my mnemonic notations: (1:Deduce) is illustrated as
follows. Take as $T_b$, the theory of scaling dimension, and-or one or more of its
generalizations like the Hausdorff or box dimension; and if you wish, include, as a sub-theory, the
topological theory of dimension. Take as $T_t$ the assignments of non-integral dimensions
to sets like $C$, $K$ (and if $T_b$ includes the generalizations, to other sets that are not exactly
self-similar). Then clearly, we have reduction: $T_b$ contains $T_t$! (Or in terms of Section
3.3.2's quantity $f$ whose value, 1 or 0, records the presence or absence of the emergent
property: $f_\infty = 1$.)

But there is "the other side of the coin": the emergent behaviour is not deducible if
we do not take the limit. Notice that the situation is a bit different from that for the
method of arbitrary functions (Section 4.2.1). There, all one needed so as to deduce the
emergent behaviour was consideration of the limit. Here, one needs ideas that go beyond
the topological notion of dimension — discontinuous though it is, in the limits concerned.
One needs the idea of dimension as an exponent, as developed in scaling dimension or its
generalizations. But notwithstanding this difference, the main point is that (1:Deduce)'s
second claim holds true again. Namely: if $T_b$ is just the traditional theory of dimension,
there is no reduction; and because this weaker theory is salient, it is tempting to speak of
irreducibility.

Finally, note another contrast with the method of arbitrary functions. Section 4.2.1
ended by noting that no roulette wheel has infinitely many arcs; nor is any wheel spun
infinitely fast. In Section 3.1's notation: there was no infinite system $\sigma(\infty)$. But in the
fractals example, there are such infinite systems — the sets $C, K \subset \mathbb{R}^n$ etc. — and the whole
discussion focusses on them.

5.2.2 Emergence before the limit 

(2:Before) claims that before the limit, there is emergence in a weaker but still vivid 
sense. It is illustrated in a manner parallel to the method of arbitrary functions. Thus
recall Section 4.2.2's discussion of approximate equiprobability in, for example, a casino's
roulette wheel. For fractals, the obvious analogue of the wheel is a computer running
some software so as to produce a simulation of some fractal set, by iterating the steps 
of its definition some finite number of times. The most obvious case is computer 
graphics software, producing an approximate or coarse-grained image of a fractal set. 
Nowadays, such images are ubiquitous in films and games, superseding the static images 
in yesteryear's lavish books (e.g. Peitgen and Richter 1986). 

It is easy to check that all of Section 4.2.2's discussion — about how one can
calculate, perhaps numerically, how closely a set-up approximates equiprobability, and how we
philosophers can leave it to the casino-owners to worry about how close is close enough
to be indiscernible by prospective gamblers — carries over to fractals, mutatis mutandis. I
will save space by not spelling this out. In short: what was said there, about the practical
purposes of a casino in making a wheel fair enough that a gambler cannot profit from 
assiduously observing its long-run statistics, carries over here to the practical purposes of 
a film studio in making a simulated image look fractal at spatial scales so small that even 
the most hawk-eyed cinema-goer cannot see that it is in fact not fractal. 
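
A finite stage "looking fractal" down to some scale can be made quantitative with a box count. The following is a minimal sketch in Python, under assumptions of mine: the set is stage $N$ of the Cantor construction, $C_N$; grid cells have side $3^{-k}$; and a cell is counted if it overlaps $C_N$ in more than a point. The names `box_count` and `local_slopes` are illustrative.

```python
import math

def cantor_lefts(n):
    """Integer numerators a, over 3**n, of the left endpoints of the 2**n intervals of stage n."""
    pts = [0]
    for level in range(n):
        step = 2 * 3 ** (n - 1 - level)
        pts = [a + d for a in pts for d in (0, step)]
    return pts

def box_count(stage, k):
    """Number of grid cells of side 3**-k that overlap the stage set in more than a point."""
    lefts = cantor_lefts(stage)
    if k <= stage:
        shrink = 3 ** (stage - k)          # each stage interval lies inside one coarse cell
        return len({a // shrink for a in lefts})
    refine = 3 ** (k - stage)              # each stage interval splits into `refine` fine cells
    cells = set()
    for a in lefts:
        cells.update(range(a * refine, (a + 1) * refine))
    return len(cells)

def local_slopes(stage, k_max):
    """Slope of log(count) against log(1/delta) between successive resolutions 3**-k."""
    counts = [box_count(stage, k) for k in range(k_max + 1)]
    return [math.log(counts[k + 1] / counts[k]) / math.log(3) for k in range(k_max)]

slopes = local_slopes(stage=4, k_max=7)
print([round(s, 3) for s in slopes])
```

For stage 4 the local slope is $\log 2 / \log 3$ at all resolutions coarser than $3^{-4}$, and 1 at all finer ones. So an observer whose resolution stops short of $3^{-N}$ cannot distinguish $C_N$, whose topological and box dimension is 1, from the limit set $C$: which is the sense in which the non-integral exponent is present before the limit.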

But there are two other topics worth pausing over. One is obvious from the mention of 
computer graphics: the use of fractals to model naturally occurring objects like mountains, 
rocks, trees and leaves. This merits a separate discussion; cf. Section 5.3.

The other topic is an analogue for fractals of the quantifier-shift that Section 4.2.1
discussed as underlying the "two sides of the coin" in (1:Deduce). (This topic is also
connected to the robustness requirement in my notion of emergence; but I will not pursue 
this.) 

Thus take a traditional geometrical variable magnitude: in philosophers' jargon, a
determinable property of a geometrical figure $F$. For example, consider 'contains a
continuous arc of length greater than $L$' (variable $L$). And suppose we have a repeatable
definitional process, that at its $M$th iteration defines a figure (subset of $\mathbb{R}^n$), $F_M$, and
that introduces successively finer structure so that for each value $L$ of the variable, $F_M$
lacks the property for sufficiently large $M$. That is: the property is lost after sufficiently
many iterations. Or to put it more positively: an approximate or coarse-grained version
of a fractal-like property is gained. For example, the definitional process might imply:
$\forall L > 0, \exists N, \forall M > N$: the figure $F_M$ lacks arcs of length greater than $L$. But for smaller
$L$, more iterations will be needed.

To make an analogy with Section 4.2.1's quantifier-shift, we now develop this idea so
as to both:

(a) use a 'resolution' $\varepsilon$, as is usual in definitions of convergence;

(b) make a "pointwise vs. uniform" contrast, by quantifying over some set $\mathcal{G}$ of
geometrical properties, or sub-figures, of the figure $F_M$.

Thus suppose that in the figure $F_M$ at stage $M$, the only, or the largest, example of a
property or sub-figure $G \in \mathcal{G}$ is of size (say, length) $\varepsilon$. I will write this as: $\mathrm{Size}(F_M, G) = \varepsilon$.
Then the successive loss of properties $G \in \mathcal{G}$ — more exactly: the loss of visible,
large-enough-to-be-seen, $G \in \mathcal{G}$ — by a sequence of figures $\{F_M\}$ can happen: either pointwise
across $\mathcal{G}$, viz.

$\forall \varepsilon, \forall G \in \mathcal{G}, \exists N, \forall M > N : \mathrm{Size}(F_M, G) < \varepsilon$;

or uniformly across $\mathcal{G}$, viz.

$\forall \varepsilon, \exists N, \forall G \in \mathcal{G}, \forall M > N : \mathrm{Size}(F_M, G) < \varepsilon$.

Besides, there are alternatives to using such a set $\mathcal{G}$ so as to make the pointwise/uniform
contrast. We could instead use different parts of the figures $F_M$. Thus one can imagine
the stages of the definitional process to proceed at different "rates" in different regions: 
in different thirds, ninths,..., of the Cantor set; or sides, sub-sides, sub-sub-sides,..., of 
the Koch snowflake. If the rates vary in a suitably ever-slower way, across a denumerable 
sequence of sub-regions, one would get convergence to the fractal structure that is merely 
pointwise across the set. 
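
The difference between the two quantifier orders can be seen in a toy model of such ever-slower rates. This is a minimal sketch in Python, and the model is wholly hypothetical: sub-regions are indexed by $j = 0, 1, 2, \ldots$; region $j$ is left untouched until stage $j$, and thereafter its largest unrefined piece shrinks by a factor 3 per stage. The names `size` and `threshold` are illustrative.

```python
from fractions import Fraction

def size(M, j):
    """Toy size of the largest unrefined piece of region j at stage M (hypothetical model)."""
    return Fraction(1, 3 ** max(0, M - j))

def threshold(j, eps):
    """Smallest N such that size(M, j) < eps for every M > N (size is non-increasing in M)."""
    N = 0
    while size(N + 1, j) >= eps:
        N += 1
    return N

eps = Fraction(1, 100)

# Pointwise: each region has its own threshold, but the thresholds grow with j.
thresholds = [threshold(j, eps) for j in range(6)]

# Uniform over finitely many regions: the largest threshold serves for all six.
uniform_N = max(thresholds)

# Over many regions (here truncated at 50), some region is still unrefined at each stage.
worst_at_stage = [max(size(M, j) for j in range(50)) for M in (5, 10, 20)]

print(thresholds, uniform_N, worst_at_stage)
```

In this toy model the threshold for region $j$ is $j + 4$ at $\varepsilon = 1/100$. So for each region there is a stage after which it looks refined, but over a denumerable family of regions no single stage serves for all: convergence is pointwise without being uniform.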

5.2.3 Supervenience is a red herring 

I shall be very brief about my third claim, (3:Herring): that although various
supervenience theses are true, they yield little or no insight into emergence, or more generally, into
"what is going on" in the example. For the situation is again like that for the method of
arbitrary functions (Section 4.2.3): my claim holds true, essentially because supervenience
makes no connection with the main ideas of the example — self-similarity and dimension 
as an exponent. 

For any finite $N$, the property of interest, dimension, of the object concerned, i.e. of
a subset $X \subset \mathbb{R}^n$, "supervenes on how $X$ is constituted from points" — in at least two
obvious senses of this phrase. Namely: (i) the trivially strong sense in which only $X$ itself
contains those very points (cf. set-theory's axiom of extensionality); (ii) the marginally
weaker sense in which as regards its constitution from points, $X$ matches any congruent
or scaled copy of $X$. And since in this example, there are infinite systems $\sigma(\infty)$, i.e. the
"irregular" sets $C, K \subset \mathbb{R}^n$ etc., the same goes for $N = \infty$. That is: the dimension of
these sets, in any of the several senses of dimension, thus supervenes.

But such supervenience theses are trivial and useless, for the two now-familiar reasons.
(a): They provide no control on the infinity (infinite disjunction) they are concerned with,
because no kind of limit is taken. (b): Their infinity makes no connection with the limit,
$N \to \infty$, that the example is concerned with. In particular, the supervenience thesis gives
no hint that we can use the idea of dimension as an exponent so as to define non-integral
dimensions.

5.3 The fractal geometry of nature? 

So far, the pure mathematics of dimension has dominated the discussion. But fractals have 
many empirical applications. As I discussed in Section 5.1.2.A, countless empirical studies
have found power law behaviour with a dimension as a non-integral exponent: recall the
examples of the coastline and electrodeposited copper. And Section 5.2.2 mentioned
computer graphics' use of fractals to model objects like mountains, trees and leaves. This
representational power of fractals is remarkable, indeed amazing.^20 Thus fractals have
been hailed as revealing the true geometry of nature, e.g. by Mandelbrot (1982). But this 
claim has been disputed (Shenker 1994, especially Sections 3-5; Smith 1998, pp. 31-38): 
hence this Subsection's title. 

I will argue that with my claims (2:Before) and (4:Unreal), we can put this controversy 
to rest. I will distinguish two senses of the phrase 'geometry of nature', and propose that 
fractal geometry is a geometry of nature, in the second sense but not the first. It will be 
clear that (2:Before) corresponds to the second sense, while (4:Unreal) corresponds to the 
first. Finally, I will introduce an "abstract", rather than "natural history", sense of the 
phrase. In this last sense, fractal geometry is again a geometry of nature; and this again 
corresponds to (2:Before). 

Suppose first that 'geometry of nature' means 'the completely accurate description of 
the shapes and sizes of macroscopic objects'. Then it sure looks like fractal geometry is 
the geometry of nature — as many a film with computer-generated graphics attests. But 
authors such as Shenker have objected that a fractal has an infinite sequence of intricate 
but similar structure on ever smaller length scales; while a mountain, rock, tree, fern and 
leaf do not, thanks to their atomic constitution. This objection is of course correct: recall 
my claim (4:Unreal) of Section 2. So despite initial appearances, fractal geometry is not
in this sense the geometry of nature. 

^20 And noticed by the wider culture: in Stoppard's play Arcadia (1993), the hero Valentine describes
a stage-by-stage computer-simulation: 'If you knew the algorithm and fed it back say ten thousand
times, each time there'd be a dot somewhere on the screen. You'd never know where to expect the next
dot. But gradually you'd start to see this shape, because every dot would be inside the shape of this
leaf. It wouldn't be a leaf, it would be a mathematical object.' In another passage he is lyrical about
fractals' representation of other 'ordinary-sized stuff which is our lives, the things people write poetry
about — clouds, daffodils, waterfalls'.

Indeed, the objection can be sharpened, in two ways: one theoretical, one practical.
(Neither seems to have been noticed in the literature.) I touched on the theoretical
sharpening, already in footnote 18, when I noted that while Euclidean geometry admits
the similarity of triangles and other figures, on which self-similarity and so fractals depend,
non-Euclidean geometries do not. This means that if physical space is in fact slightly
non-Euclidean on even the tiniest scales, as general relativity and cosmology nowadays say,
then macroscopic objects could not be exactly fractal — even if atomism was false and they
were instead composed of continuous matter, even on arbitrarily small length scales. So
here again, we meet my claim (4:Unreal).^21

The practical sharpening concerns the details of Section 5.1.2.A's empirical studies
of power laws with a quantity $f$ proportional to a non-integral power of a resolution $\delta$:
$f = \text{constant} \times \delta^{D}$. Suppose that faced with such a study, we ask: how many orders
of magnitude of $\delta$ does the data report — or does the analysis in fact probe? The answer
can be: disappointingly few. A survey of ninety-six Physical Review articles (in the years
1990-1996) reporting fractal analysis of data found that among these articles: (i) the
average spread of resolutions that were probed was 1.3 orders of magnitude; and (ii) at
most three orders of magnitude were probed (Avnir et al. 1998). In terms of measuring
the length of a coastline: an "average paper" in the set surveyed by Avnir et al. would
describe the coastline or its length as 'fractal', though the authors considered a spread of
resolutions that went only from some length $L$ to about twenty times $L$ ($10^{1.3} L \approx 20L$). And
even the papers that were most stringent, or cautious, in describing their phenomenon as
'fractal' probed their resolution only up to three orders of magnitude!^22

To sum up about this first sense of 'geometry of nature': if we ask the question:

Do fractals describe, with complete accuracy, the shapes and sizes of naturally 
occurring macroscopic objects? 

we have to answer 'No'. 

But despite this answer 'No', the representational power of fractals remains very strik- 
ing. Power laws with a non-integral exponent describe very many phenomena; and our 
understanding of the phenomenon is often enhanced, empirically as well as theoretically, 
by adding to the bare power law, the suggestive language and exact theorems of fractal 
geometry. Here again we see that in a suitably weak sense, emergence can occur before 
the relevant limit: (2:Before) again! 

Besides, this is consistent with (4:Unreal), since (2:Before) applies to values of the
parameter which are typically much smaller than those making true (4:Unreal). That 
is: our 'No' answer turned upon our question's requiring complete accuracy. If instead we 
ask, in the context of modelling some specific phenomenon involving naturally occurring 
macroscopic objects, 'Do fractals describe, with sufficient accuracy, the shapes and sizes 
of these objects?', our answer would very often be 'Yes'. In this weaker sense, fractal 
geometry undoubtedly is a geometry of nature. 

^21 I stress the phrases 'nowadays say' and 'macroscopic objects'. I of course agree that, for all we know,
fractals may be involved as fundamental structures in the ultimate theory, at present unknown, of matter
and-or space. But that is not our concern.

^22 Thanks to Leo Kadanoff for commenting that, happily, the range probed can also be much larger.
He mentions the work of Libchaber and co-authors on turbulence, and Nagel and co-authors on glassy
behaviour. Indeed, the former have probed five orders of magnitude (e.g. Castaing et al. 1989), and the
latter have probed thirteen (Dixon et al. 1990). I presume that the latter group's Physical Review papers
have been omitted from Avnir et al.'s survey for the ironic reason that they meritoriously avoid using the
word 'fractal'.

There is another aspect to this resolution of the controversy; (which, like the foregoing,
should not be controversial!). So far I have considered the shapes and sizes of macroscopic
objects in physical space. But suppose we allow that 'geometry' applies to objects or
structures in other spaces: in physical theories' state-spaces, which indeed often include
objects and structures the theory calls 'geometric'. Some of these theories are strikingly
successful, in the depth and accuracy of their theoretical descriptions and observational
predictions. So some of these postulated geometric descriptions surely deserve to be called
(a) 'geometry of nature': i.e. a geometry of an object or structure in a physically real
space, albeit a more abstract (and often less visualizable) space than physical space. So
if we ask instead a third question:

Do some of our successful physical theories use fractals to describe certain sub- 
sets of their abstract spaces, in particular attributing a non-integer dimension 
to such objects? 

the answer is again: Yes. 

Here are just two examples, out of many one could cite. One can check that in each of
them, the justification for using fractals, i.e. for the $N \to \infty$ limit, is the Straightforward
Justification of Section 3.3.3, with its two obvious reasons, mathematical convenience and
empirical success.

(1) : In classical mechanics, there are physically important fractal subsets of the phase 
spaces of systems. In particular, the famous Lorenz and Henon attractors have fractal 
dimension. (The philosophical literature on chaos theory has discussed these, along the 
lines of the Straightforward Justification; e.g. Smith's (1998, pp. 41-43, 50-56) discussion 
of the Lorenz attractor.) 

(2) : Statistical mechanics describes some aspects (viz. critical points) of some pro- 
cesses (phase transitions, like boiling and freezing) with scale-free (regimes of) theories, 
which involve power-law behaviour on all scales and self-similarity, and therefore fractals. 
Section 7 will give more details. In particular, Section 7.2.2 will discuss the phenomenon
of cross-over: in which, as the parameter $N$ grows, the system's behaviour crosses over
from illustrating (2:Before), at intermediate values of $N$, to illustrating (4:Unreal), at
larger values of $N$, to again illustrating (2:Before) at yet larger values of $N$. Thus
cross-over will be a vivid illustration of my swings-and-roundabouts, (2:Before)-and-(4:Unreal),
Yes-and-No, answer to the question 'Is fractal geometry the geometry of nature?'

5.4 The story so far: summing up fractals 

Let me sum up the fractals example as a list of six morals. It will be obvious, without
making explicit my four claims, or Section 3.3.3's Straightforward Justification, or the
parallels with the method of arbitrary functions, that this list also sums up the whole
story so far.

(i) : The large finite is often well-modelled by the infinite. 

(ii) : Such models are often justified in a straightforward, even obvious, way, by math- 
ematical convenience and empirical success. 

(iii) : The infinite often brings new mathematical structure: in the fractals example, 
non-integer dimension. 
(iv) : Nevertheless, there is often a reduction: the emergent non-integer dimensions are
reducible to a sufficiently rich theory that takes the infinite limit.

(v) : On the other hand, one can often see emergent behaviour on the way to the limit. 
Thus in the fractals example: the larger i.e. worse the spatial resolution — the worse your 
eyesight — the sooner in the iterative definitional process you see (more precisely: think 
you see!) the fractal structure. 

(vi) : Various supervenience theses hold — but they are trivial, or at least scientifically 
useless. 
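
Moral (v) can be made concrete with a small numerical sketch. The middle-thirds Cantor set is used here purely as a stand-in example (an assumption of this sketch, as are the function names); the point is only that a finite stage of the iterative construction already shows the non-integer dimension at coarse resolution, and loses it at fine resolution.

```python
import math

def cantor_left_endpoints(n):
    """Left endpoints, in units of 3**-n, of the 2**n intervals at stage n."""
    ends = [0]
    for _ in range(n):
        ends = [3 * a for a in ends] + [3 * a + 2 for a in ends]
    return ends

def box_count(n, k):
    """Number of grid boxes of side 3**-k that meet the stage-n set."""
    ends = cantor_left_endpoints(n)
    if k <= n:
        # Coarse boxes: several stage-n intervals fall into the same box.
        return len({a // 3 ** (n - k) for a in ends})
    # Fine boxes: each stage-n interval is cut into 3**(k - n) boxes.
    return len(ends) * 3 ** (k - n)

def local_dimensions(n, k_max):
    """Slope of log(count) against log(1 / box side) between successive scales."""
    counts = [box_count(n, k) for k in range(k_max + 1)]
    return [math.log(counts[k + 1] / counts[k]) / math.log(3) for k in range(k_max)]

slopes = local_dimensions(n=6, k_max=10)
for k, d in enumerate(slopes):
    print(f"box side 3^-{k} -> 3^-{k + 1}: apparent dimension {d:.4f}")
```

At resolutions coarser than the stage reached, the apparent dimension is log 2 / log 3; at finer resolutions the set looks like the finite union of intervals that it is, with dimension 1.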

6 Superselection 
6.1 Introduction 

I turn to the first of my two examples from physics proper: superselection in the $N \to \infty$
limit of quantum mechanics. As in the previous two examples, I will first expound the
technical details without reference to my claims (Section 6.2), and then illustrate the
claims (Section 6.3). But to make those illustrations richer, I will give more details
than previously. So the aim of this Subsection is to describe the lie of the land, and so
indicate which details matter more. I will do this by describing (i) the general topic of
the quantum-classical transition (Section 6.1.1) and (ii) the basic idea of superselection
in the limit (Section 6.1.2).

6.1.1 Out of the quantum soup 

This example is an aspect of a much larger topic: the emergence of the classical world
from the quantum world. This is often discussed in terms of a limit $\hbar \to 0$. But the topic
involves a lot else, such as proposals for the importance of specific states (e.g. coherent
states), and-or of specific physical processes (e.g. decoherence). My example is just one
case of the general idea that classical physics should emerge for "large" quantum systems:
so in this example of my theme that $N \to \infty$, $N$ will be the number of degrees of freedom.
(We will see how this $N \to \infty$ relates to $\hbar \to 0$. But as I announced earlier, I will not
in this Section pursue my fourth claim (4:Unreal).)

By way of glimpsing the larger topic, I note that there are now many well-understood
examples of classical physics emerging for large (i.e. $N \to \infty$) quantum systems. Much of
this work uses the algebraic approach to quantum theory, in which systems are primarily
described by algebras of quantities, on which the states are linear expectation functionals;
and I shall follow suit. For example, Sewell's recent monograph uses this approach to
articulate a 'rather general scheme for ... deriving the irreversible deterministic macroscopic
dynamical laws of many-particle systems, such as those of hydrodynamics or heat
conduction, from their underlying quantum dynamics' (2002, p. 87). This scheme forms
a girder across the rest of Sewell's book: it is realised in detail, in several examples.

^■^The first is a toy-model reminiscent of quantum Brownian motion: a massive particle at one end of 
To illustrate my claims, I can treat superselection more simply than by following
Sewell's scheme: the main simplification will lie in ignoring dynamics (especially, the
deduction of classical equations of motion) and aiming only to deduce, as $N \to \infty$,
classical kinematics (more precisely, commutativity of quantities). But to give a general
perspective, it is worth first quoting Sewell's scheme. (These details are not needed later.)

Sewell's macroscopic picture is given by a classical dynamical system $\mathcal{M} = (\mathcal{Y}, T)$,
where $\mathcal{Y}$ is a topological space, and $\{T(t) \mid t \in \mathbb{R}_+\}$ is a one-parameter semigroup of
transformations of $\mathcal{Y}$. $\mathcal{M}$ is to correspond to the dynamics of a one-parameter family
of finite sets of quantities of a quantum system $\Sigma$; where $\Sigma$'s evolution will be given by
a one-parameter group $\alpha(\mathbb{R})$ of automorphisms of $\Sigma$'s algebra of quantities. So we write

$$Y_\Omega = \{Y_\Omega^{(1)}, \dots, Y_\Omega^{(k)}\}.$$

Here $\Omega$ is a "large", dimensionless, positive parameter whose magnitude provides a
measure of the quantities' macroscopicality. We then require that there is a set $\Delta$ of
states of $\Sigma$ such that:

(a) : For each state $\phi \in \Delta$, the means and dispersions of $Y_\Omega$ converge to limits,
$Y(\phi) = Y$ and 0, respectively, as $\Omega \to \infty$.

(b) : As $\phi$ runs through $\Delta$, the resultant range of the limiting values $Y = Y(\phi)$ is
just the classical phase space $\mathcal{Y}$. (So $\mathcal{Y}$ is a subset of $\mathbb{R}^k$.)

(c) : The classical dynamical semigroup $T(\mathbb{R}_+)$ of $\mathcal{M}$ is induced by the evolution of $\Sigma$
from states in $\Delta$, on a "macroscopic" time scale $\Omega^\gamma$, with $\gamma > 0$. To be precise: we require
that the mean and dispersion, for a state $\phi \in \Delta$, of the $k$ macroscopically time-evolved
quantities $\alpha(\Omega^\gamma t) Y_\Omega$ should converge to $T(t)Y$ and 0, respectively, as $\Omega \to \infty$.
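
Condition (a) can be illustrated by a toy case that is not Sewell's own: $N$ spin-halves in a product state, with the number of spins $N$ playing the role of the large parameter, and the macroscopic quantity taken to be the average z-spin. These identifications, and the function name below, are assumptions of this sketch. The mean stays fixed while the dispersion falls off as $1/N$.

```python
from math import comb

def mean_and_dispersion(n_spins, m):
    """Exact mean and variance of the averaged z-spin for a product state.

    Each spin independently gives +1 with probability p = (1 + m) / 2,
    so that its own expectation value is m (an illustrative assumption).
    """
    p = (1 + m) / 2
    mean = 0.0
    second_moment = 0.0
    for k in range(n_spins + 1):          # k = number of spins found "up"
        prob = comb(n_spins, k) * p**k * (1 - p) ** (n_spins - k)
        y = (2 * k - n_spins) / n_spins   # value of the averaged quantity
        mean += prob * y
        second_moment += prob * y * y
    return mean, second_moment - mean**2

for n in (10, 100, 1000):
    mean, disp = mean_and_dispersion(n, m=0.6)
    print(f"N = {n:5d}: mean = {mean:.4f}, dispersion = {disp:.6f}")
```

In this toy case the dispersion is $(1 - m^2)/N$, so it converges to 0 as required, while the limiting mean $m$ labels a point of the classical state space.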

Accordingly, Section 6.2 will give analogues of Sewell's kinematic (a) and (b), though
not of his dynamical (c). But in two other respects, I will go beyond the above scheme. 
First, I will emphasise that there are two different infinite limits to be considered: 

(i) the classical limit of ever-larger (increasing N) quantum systems, which will have 
all quantities commuting, and which corresponds to Sewell's (a) and (b); and

(ii) the quantum limit of ever-larger quantum systems, which will exhibit superselec- 
tion, i.e. some quantities apart from multiples of the identity operator commuting with 
all quantities: in algebraic terms, the centre of the algebra of quantities being non-trivial. 

Second, I will discuss these limits in terms of deformation quantization: which has the
merits not only of generality and precision, but also of showing how (both of) these limits
are continuous, pace the frequent emphasis on the singularity of $\hbar \to 0$, and philosophers'
frequent emphasis on discontinuous limits as a signature of emergence. (Recall
the discussions in Sections 1.1 and 3.3.) In both these respects, my treatment follows
(but simplifies) some material in Landsman's masterly account of the quantum-classical
transition (2006).

a semi-infinite chain of much lighter particles, with harmonic nearest-neighbour interactions; p. 94-106. 
His Chapters 7, 10 and 11 describe much more advanced cases, e.g. lasers. 
6.1.2 The idea of superselection in the limit

Setting aside all details of both mathematics and physics, I will state in a nutshell the
idea of superselection emerging in the limit $N \to \infty$, by using the idea that a sequence of
numbers, each less than one, has in general an infinite product equal to zero. Thinking
of each number as an inner product of two vectors allows us to think of this zero as
orthogonality: a hallmark of superselection.

Imagine that we assign two real vectors, each of unit length and with angle $\theta$ between
them, the "score" $\cos\theta$. So if they are parallel, the score is $\cos 0 = 1$; but if they are not
parallel, it is less than 1. And suppose we assign two sequences of unit vectors, each with
$N$ members, a score that is the product of the cosines of the angles between corresponding
members:

$$\mathrm{score}(\langle v_1, v_2, \dots, v_N \rangle, \langle u_1, u_2, \dots, u_N \rangle) := \cos\theta_{v_1 u_1} \cdot \cos\theta_{v_2 u_2} \cdots \cos\theta_{v_N u_N}$$ (6.1)

We now let $N$ tend to infinity, and consider the limiting values of the score we have
defined. We note two sorts of case:

(1) : A pair of infinite sequences $\langle v_i \rangle, \langle u_i \rangle$ in which the vectors at corresponding
positions $i$ are not parallel only for finitely many $i$. Then only finitely many factors in
the score will be different from 1; infinitely many factors will be 1. So the total infinite
product of numbers is a product of finitely many cosines each less than 1. This is some
real number of modulus less than 1. It might be zero: namely if at least one pair of vectors
$v_i, u_i$ is perpendicular.

(2) : A pair of infinite sequences $\langle v_i \rangle, \langle u_i \rangle$ in which the vectors at corresponding
positions $i$ are not parallel for infinitely many $i$. So the total infinite product of numbers is a
product including infinitely many numbers that are each less than 1. In general, this
infinite product is zero (even if there are also infinitely many factors each equal to 1).

These elementary considerations underlie the emergence of superselection in the $N \to \infty$
limit of quantum mechanics. (1) corresponds to two quantum states (two infinite
sequences of unit-vectors) being in the same superselection sector. And (2) corresponds
to two quantum states being in different superselection sectors.
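
The two cases can be sketched numerically. The particular angles and the helper names below are illustrative assumptions, not part of the text. The third sequence shows why the claim about infinite products holds only "in general": if the angles shrink fast enough that the sum of $1 - \cos\theta_i$ converges, the product stays away from zero.

```python
import math

def score(angles):
    """Product of cos(theta_i) over corresponding pairs of unit vectors."""
    result = 1.0
    for theta in angles:
        result *= math.cos(theta)
    return result

def case_one(n, misaligned=3, theta=0.4):
    """Only the first few pairs are non-parallel; the rest have angle 0."""
    return [theta] * min(n, misaligned) + [0.0] * max(0, n - misaligned)

def case_two(n, theta=0.4):
    """Every pair is non-parallel, with the same fixed angle."""
    return [theta] * n

def shrinking_angles(n):
    """Every pair is non-parallel, but the angles 1/i shrink quickly."""
    return [1.0 / i for i in range(1, n + 1)]

for n in (1, 10, 100, 1000):
    print(n, score(case_one(n)), score(case_two(n)), score(shrinking_angles(n)))
```

With a fixed angle at every position the score decays geometrically, which is the situation relevant to two differently aligned spin chains.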

Now I enter into details. As in my previous two examples, I first expound the
technicalities without reference to my claims (Section 6.2), and then illustrate the claims
(Section 6.3).

6.2 Superselection in the $N \to \infty$ limit of quantum mechanics

In Section 6.2.1, I describe our prototype systems: finite and infinite spin chains. This will
already exhibit the idea just expounded, of a "score" of sequences of vectors that converges
to zero. In Section 6.2.2, I introduce the ideas of deformation quantization (especially
continuous fields of algebras of quantities), in terms of which our $N \to \infty$ limits will be
continuous. In Section 6.2.3, I treat the classical infinite-$N$ limit of quantum systems:
for spin chains, this means identifying macroscopic quantities with averages of quantum
observables, with the average being taken over greater and greater segments of the chain.
In Section 6.2.4, I treat the quantum infinite-$N$ limit: here the limiting quantities are
local in the sense that they "ignore" all but a finite part of the system. In Section 6.2.5,
I discuss classical states of the quantum-infinite, and thus connect with superselection.
NB: Each Section begins with a brief statement and then, after the announcement 'In
Detail', gives details, which the reader in a hurry can skip.

For all this material, and much more, I recommend Landsman (2006, especially Sections
4.3, 6.1-6.4), who also gives many references (and whose notation I adopt). Compared
with Landsman, I have chosen to down-play infinite tensor products and the representation
of algebras, and to emphasise deformation quantization. This avoids repeating
material which is well-known in the philosophy of physics literature; and more important,
gives what we need for Section 6.3's illustrations of my claims.

6.2.1 Spin chains 

I begin with four claims about spin chains, which serve as ideal infinite models of
ferromagnets. I shall take a doubly infinite spin-half chain, with sites labelled by the integers
$\mathbb{Z}$. But it will be clear that similar claims would hold for higher spin, and for a
one-dimensional half-infinite chain (sites labelled by $\mathbb{N}$) or for a two- or three-dimensional spin
lattice ($\mathbb{Z}^2$ or $\mathbb{Z}^3$). In these claims, we eschew infinite tensor products and non-separable
Hilbert spaces. Rather we define a continuous family of separable Hilbert spaces, each of
which will later turn out to be a superselection sector.

The overall physical idea is that: 

(a) The vacuum i.e. ground state has all the spins aligned in one spatial direction, 
with other (higher-energy) states built up by flipping a finite number of spins from the 
preferred direction (and superposing); yielding a separable Hilbert space representing the 
spin-algebra. 

(b) But nature does not prefer one such direction. So for any direction, there is a 
vacuum state with the spins thus aligned; and the higher-energy states built up from this 
vacuum yield an associated Hilbert space. 

(c) These Hilbert space representations differ in a global/macroscopic quantity 
(spin density). 

(d) The representations are thereby unitarily inequivalent, even though intuitively
a global rotation (an element of SO(3)) of course rotates one vacuum to another.

In Detail: One shows four claims, e.g. for the one-dimensional doubly infinite spin 
chain; (cf. e.g. Sewell 1986: 16-18, or 2002: 15-18). 

(a): For each direction (where positive and negative z-directions count as two
directions), there is an irreducible representation of the infinite spin algebra generated
by denumerably many pairwise-commuting copies of the trio of the Pauli matrices, i.e.
generated by matrices: $\{\sigma_n := (\sigma_{n,x}, \sigma_{n,y}, \sigma_{n,z}) \mid n = 0, \pm 1, \pm 2, \dots\}$.

Let $S$ be the set of doubly infinite $(+1, -1)$-sequences, $s = (s_n)_{n \in \mathbb{Z}}$, $s_n = \pm 1$. We
think of this as the set of "classical" configurations of the eigenvalues $\pm 1$ of $\sigma_z$ at each
site. Let $S^{(+)} \subset S$ be those configurations $s$ with all but finitely many $s_n = +1$. We think
of these as local modifications of the z-up (classical) vacuum $(\dots, 1, 1, 1, \dots)$, i.e. the
configuration with every $s_n = +1$. Let $\mathcal{H}^{(+)}$ be the square-summable functions on $S^{(+)}$:

$$\mathcal{H}^{(+)} := \{\phi : S^{(+)} \to \mathbb{C} \mid \sum_{s \in S^{(+)}} |\phi(s)|^2 < \infty\},$$ (6.2)

with inner product

$$(\phi, \psi)^{(+)} := \sum_{s \in S^{(+)}} \overline{\phi(s)}\, \psi(s).$$ (6.3)

The vectors $\phi_s^{(+)}$, $s \in S^{(+)}$, defined by $\phi_s^{(+)}(s') = \delta_{s,s'}$, $s, s' \in S^{(+)}$, are in one-one
correspondence with the configurations $s$. They form an orthonormal basis of $\mathcal{H}^{(+)}$.

We can now define operators whose action on the basis vectors $\phi_s^{(+)}$ is the analogue
of the action of the three Pauli matrices on a single spin. That is, we define
$\{\sigma_{n,u}^{(+)} \mid n \in \mathbb{Z}, u = x, y, z\}$ on $\mathcal{H}^{(+)}$ in the obvious way, so as to build a
representation on $\mathcal{H}^{(+)}$ of the abstract spin-algebra:

$$[\sigma_{n,x}, \sigma_{n,y}] = 2i\sigma_{n,z} \ \text{etc.}; \qquad [\sigma_{m,x}, \sigma_{n,y}] = 0 \ \text{for } m \neq n, \ \text{etc.}$$ (6.4)

The representation is irreducible since we can pass from any basis vector $\phi_s^{(+)}$ to any other
$\phi_{s'}^{(+)}$ by a sequence of operators $\sigma_{n,x}^{(+)}$.

Note that because we fixed on the denumerable $S^{(+)} \subset S$, not the continuously
large $S$, we got a separable Hilbert space $\mathcal{H}^{(+)}$. So this representation does not require
us to make sense of a denumerable tensor product (of copies of $\mathbb{C}^2$).

(b) : We can "play the same game", starting with z-down. That is: let $S^{(-)} \subset S$ be
those configurations $s$ with all but finitely many $s_n = -1$. We think of these as local
modifications of the z-down (classical) vacuum. We get a representation on $\mathcal{H}^{(-)}$.

Of course, the choice of z was arbitrary. So we can build a representation for any 
direction. And each such representation: (a) takes the ground state to have all spins 
aligned along the direction; and (b) builds elements of the representation as all linear 
combinations of product states obtained from the ground state by a finite number of 
unitary transformations (spin-flips or rotations) at individual sites.

(c) : For each such representation, every state in it (indeed every density matrix on it)
has a common value of a global quantity (aka: classical or macroscopic quantity): namely,
the (vector) spin density defined as the limit as $N \to \infty$ of the average of the spin matrices
at the sites $-N, -N+1, \dots, N-1, N$, i.e. $\lim_{N \to \infty} \frac{1}{2N+1} \sum_{n=-N}^{N} \sigma_n$.

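In more detail: On our first space $\mathcal{H}^{(+)}$, define

$$m_N^{(+)} := \frac{1}{2N+1} \sum_{n=-N}^{N} \sigma_n^{(+)}$$ (6.5)

so that the expectation value on basis states is

$$(\phi_s^{(+)}, m_N^{(+)} \phi_s^{(+)}) = \left(0, 0, \frac{1}{2N+1} \sum_{n=-N}^{N} s_n\right).$$ (6.6)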
Since all but a finite number of $s_n$ are $+1$, (6.6) implies, with $\mathbf{k}$ the unit vector along $Oz$:

$$\lim_{N \to \infty} (\phi_s^{(+)}, m_N^{(+)} \phi_s^{(+)}) = \mathbf{k}, \quad \forall s \in S^{(+)}.$$ (6.7)

Similarly, one has:

$$\lim_{N \to \infty} (\phi_s^{(+)}, m_N^{(+)} \phi_{s'}^{(+)}) = 0, \quad \text{for } s \neq s':$$ (6.8)

for in the z-component, $m_N^{(+)}$ gives no spin-flip or rotation, so that the orthogonality,
$(\phi_s^{(+)}, \phi_{s'}^{(+)}) = 0$, yields 0; while in the x and y components, the eventual agreement of $s$
and $s'$ in having thereafter always the value $+1$ means that $m_N^{(+)}$'s spin-flips and rotations
give inner product zero at those sites, while the increasing $2N + 1$ denominator kills any
initial non-zero contribution got from $m_N^{(+)}$'s action. Thus from (6.6) and (6.8) it follows
that for any unit vector $\Psi^{(+)} \in \mathcal{H}^{(+)}$,

$$\lim_{N \to \infty} (\Psi^{(+)}, m_N^{(+)} \Psi^{(+)}) = \mathbf{k}.$$ (6.9)

So this limiting spin density is a global property of the representation. And the representations
built from other vacua will have different unit vectors in $\mathbb{R}^3$ as their limiting spin
densities.

As we will discuss in more detail: these are the system's superselection sectors. There 
are continuously many sectors, each of denumerable dimension. 
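
The convergence of the z-component of the spin density on basis states can be sketched directly. The particular set of flipped sites and the function name are hypothetical choices for illustration: whichever finite set of sites is flipped, the average over the sites $-N, \dots, N$ tends to the value fixed by the vacuum.

```python
def spin_density_z(flipped_sites, n, vacuum=1):
    """Average of s_i over the sites -n..n.

    flipped_sites: the finite set of sites where s_i = -vacuum; every other
    site carries s_i = vacuum (+1 for the z-up vacuum, -1 for z-down).
    """
    total = 0
    for site in range(-n, n + 1):
        total += -vacuum if site in flipped_sites else vacuum
    return total / (2 * n + 1)

flips = {-3, 0, 1, 7, 20}   # hypothetical local modification of the vacuum
for n in (5, 50, 500, 5000):
    print(n, spin_density_z(flips, n), spin_density_z(flips, n, vacuum=-1))
```

The finitely many flipped sites contribute a term of order $1/(2N+1)$, so they cannot change the limit: this is the sense in which the spin density labels the whole representation rather than any one state in it.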

(d): Each such representation is unitarily inequivalent to every other. For example: if
the representations on $\mathcal{H}^{(+)}$ and $\mathcal{H}^{(-)}$ were unitarily equivalent, there would be a unitary
$U : \mathcal{H}^{(+)} \to \mathcal{H}^{(-)}$ with $U \sigma_n^{(+)} U^{-1} = \sigma_n^{(-)}$. This would imply that $U m_N^{(+)} U^{-1} = m_N^{(-)}$. This
in turn would imply that, for any unit vectors $\Psi^{(\pm)} \in \mathcal{H}^{(\pm)}$ with $\Psi^{(+)} = U^{-1} \Psi^{(-)}$,

$$(\Psi^{(+)}, m_N^{(+)} \Psi^{(+)}) = (\Psi^{(-)}, m_N^{(-)} \Psi^{(-)}):$$ (6.10)

but this must be false, since the two sides have different limits, viz. $\pm\mathbf{k}$, as $N \to \infty$.



6.2.2 Continuous fields of algebras and deformation quantization 

The main idea we need from the modern theory of deformation quantization (in the
C*-algebraic approach) is the idea of a continuous field of algebras of quantities. For
Sections 6.2.3 and 6.2.4 will define such fields in such a way that their $N \to \infty$ limits
are continuous. But when I enter details, I will also define two other central ideas of the
theory.

The idea is that a continuous field of algebras of quantities is like a bundle in differential
geometry. The base space is a set $I \ni \hbar$ of real numbers, and the fibre above each point
$\hbar \in I$ is a C*-algebra $\mathcal{A}_\hbar$ representing the system's quantities for that value of $\hbar$. (Of
course, this is not meant to suggest that the value of $\hbar$ really varies. As usual, this is
shorthand, essentially for the ratio of $\hbar$ to the typical values of the system's, or problem's,
actions.) The topology of the bundle is defined indirectly by specifying what are its
continuous sections. As we will see: the value $\hbar = 0$ can correspond to an $N \to \infty$ limit;
and we can choose the algebras (and the definitions of continuous sections) so as to make
this either a classical limit or a quantum limit.

More formally: a continuous field of C*-algebras over $I \ni \hbar$ consists of a C*-algebra $\mathcal{A}$,
and a collection of C*-algebras $\{\mathcal{A}_\hbar\}_{\hbar \in I}$, subject to certain conditions which imply that:

(i) : the family $\{\mathcal{A}_\hbar\}_{\hbar \in I}$ of C*-algebras is glued together by specifying the space of
continuous sections of the bundle $\coprod_{\hbar \in I} \mathcal{A}_\hbar$ (where $\coprod$ indicates disjoint union);

(ii) : the C*-algebra $\mathcal{A}$ can be identified with this space of sections.

In Detail: A continuous field of C*-algebras over $I \ni \hbar$ consists of a C*-algebra $\mathcal{A}$, a
collection of C*-algebras $\{\mathcal{A}_\hbar\}_{\hbar \in I}$, and a surjective morphism $\varphi_\hbar : \mathcal{A} \to \mathcal{A}_\hbar$ for each $\hbar \in I$
such that:

(i) : the function $\hbar \mapsto \|\varphi_\hbar(A)\|_\hbar$ is in $C_0(I)$ for all $A \in \mathcal{A}$;

(ii) : the norm of any $A \in \mathcal{A}$ is $\|A\| = \sup_{\hbar \in I} \|\varphi_\hbar(A)\|$;

(iii) : for any $f \in C_0(I)$ and $A \in \mathcal{A}$ there is an element $fA \in \mathcal{A}$ for which
$\varphi_\hbar(fA) = f(\hbar)\varphi_\hbar(A)$ for all $\hbar \in I$.

The idea is that the C*-algebras $\{\mathcal{A}_\hbar\}_{\hbar \in I}$ are glued together by a topology on the disjoint
union $\coprod_{\hbar \in [0,1]} \mathcal{A}_\hbar$. But this topology is defined indirectly, by specifying the space of
continuous sections of the "bundle". Namely, a continuous section of the field is defined
to be a map $\hbar \mapsto A_\hbar$, where $A_\hbar \in \mathcal{A}_\hbar$, for which there is an $A \in \mathcal{A}$ such that $A_\hbar = \varphi_\hbar(A)$
for all $\hbar \in I$. So the C*-algebra $\mathcal{A}$ may actually be identified with the space of continuous
sections of the field: if we do so, the morphism $\varphi_\hbar$ is just the evaluation map at $\hbar$.

With this idea of a continuous field of algebras, we can define a deformation quantization.
We begin with a classical phase space $M$, on which the continuous complex functions
$f$ represent quantities (through their real and imaginary parts). We want to define, for
each value of $\hbar$, a quantization map $Q_\hbar$ mapping such functions $f$ to elements of $\mathcal{A}_\hbar$,
subject to various conditions: in particular, the "Dirac" condition that the Poisson bracket
on $M$ is mapped to the quantum mechanical commutator (times $\frac{i}{\hbar}$). Two technical
comments: (i) $M$ can be a Poisson manifold, a mild generalization of the usual symplectic
manifold; (ii) on the other hand, we take $f$ to be smooth with compact support; the
space $C_c^\infty(M)$ of these functions is a norm-dense sub-algebra of $C_0(M)$.

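Thus we define: a deformation quantization of a phase space $M$ is a continuous field
of C*-algebras $(\mathcal{A}_\hbar)_{\hbar \in [0,1]}$ (with $\mathcal{A}_0 = C_0(M)$), along with a family of linear maps
$Q_\hbar : C_c^\infty(M) \to \mathcal{A}_\hbar$, $\hbar \in (0, 1]$, that are self-adjoint (i.e. $Q_\hbar(\bar{f}) = Q_\hbar(f)^*$), and such that:

(i) for each $f \in C_c^\infty(M)$ the map defined by $0 \mapsto f$ and $\hbar \mapsto Q_\hbar(f)$ ($\hbar \neq 0$) is a
continuous section of the given continuous field of algebras;

(ii) for all $f, g \in C_c^\infty(M)$ one has the "Dirac" condition

$$\lim_{\hbar \to 0} \left\| \frac{i}{\hbar} [Q_\hbar(f), Q_\hbar(g)] - Q_\hbar(\{f, g\}) \right\|_\hbar = 0.$$ (6.11)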



^^Here $\|\cdot\|_\hbar$ is the norm in $\mathcal{A}_\hbar$, and $C_0(I)$ is the continuous complex functions on $I$ that vanish at
infinity in the usual sense that for any $\varepsilon > 0$, there is a compact set beyond which the function is less
than $\varepsilon$. For any locally compact Hausdorff space $X$, $C_0(X)$ is a C*-algebra when equipped with the
supremum norm, $\|f\|_\infty := \sup_{x \in X} |f(x)|$. In any case, for our examples $I$ is itself compact.
This definition turns out to imply other natural "meshing" conditions, such as:

$$\lim_{\hbar \to 0} \|Q_\hbar(f) Q_\hbar(g) - Q_\hbar(fg)\|_\hbar = 0; \qquad \lim_{\hbar \to 0} \|Q_\hbar(f)\|_\hbar = \|f\|_\infty$$ (6.12)

where $\|f\|_\infty$ is the supremum norm $\sup_{z \in M} |f(z)|$ on $\mathcal{A}_0 = C_0(M)$.

By way of indicating the power of deformation quantization, let me sketch how a
definition which it enables, viz. of a continuous field of states (treated as linear functionals
on quantities), gives a natural generalization of the notion of coherent states, which are
the focus of many discussions of the $\hbar \to 0$ limit of quantum mechanics. (The details of
Sections 6.2.5 and 6.3.1 will also need this definition.)

Given a state $\omega_\hbar$ on $\mathcal{A}_\hbar$ for each $\hbar \in [0, 1]$: we define the family of states to be a
continuous field (relative to a given deformation quantization), whenever the function
$\hbar \mapsto \omega_\hbar(Q_\hbar(f))$ is continuous on $[0, 1]$ for each $f \in C_c^\infty(M)$. (In fact, this notion is
intrinsically defined by the continuous field of C*-algebras, and is therefore independent
of the quantization maps $Q_\hbar$.) In particular, one has

$$\lim_{\hbar \to 0} \omega_\hbar(Q_\hbar(f)) = \omega_0(f).$$ (6.13)



Now recall the idea of a sequence of coherent states $(\Psi_\hbar^z)_{\hbar \in [0,1]}$ that tend, as $\hbar \to 0$, to
be ever more peaked about the classical phase space state $z \in M$, so that in the limit,
the quantum expectation value tends to the value of the classical quantity at the state $z$:

$$\lim_{\hbar \to 0} (\Psi_\hbar^z, Q_\hbar(f) \Psi_\hbar^z) = f(z).$$ (6.14)

(This of course exemplifies (a) and (b) in Sewell's scheme, reported in Section 6.1.1: the
coherent states are Sewell's set $\Delta$ of quantum states, $\hbar$ is the reciprocal of his $\Omega$, and
$M$ is his phase space $\mathcal{Y}$.) Clearly, eq. (6.13) generalizes eq. (6.14). Indeed, one can show
that coherent states are examples of continuous fields of (pure) states; for details, cf.
Landsman (2006, Sections 4.2, 5.1).



6.2.3 The classical infinite: macroscopic quantities from symmetric sequences 

I turn to the classical $N \to \infty$ limit of $N$ copies of a quantum system that has an algebra
of quantities $\mathcal{A}_1$. The idea is to identify a classical, macroscopic quantity with a limiting
average of a microscopic quantum quantity. Thus the microscopic quantity is defined
over say $M$ copies of the system; (so it is often called an "M-particle observable"). But
this quantity is averaged over $N$ copies ($N > M$), and then we take the limit $N \to \infty$.
For spin chains (Section 6.2.1), this means the average is taken over greater and greater
segments of the chain.

This is made precise in terms of a continuous field of C*-algebras $\mathcal{A}^{(c)}$ over different
values of $N$. The four main points are:

(i): To conform to Section 6.2.2's notation for $\hbar \to 0$, we in fact use the reciprocal
of $N$, $1/N$, rather than $N$. Thus $\mathcal{A}^{(c)}_{1/N}$ will be the usual algebra of quantities for $N$
copies of the basic quantum system, viz. $\otimes^N \mathcal{A}_1$, i.e. the $N$-fold tensor power of the basic
algebra $\mathcal{A}_1$.

(ii) : We associate a macroscopic quantity with a sequence $A = (A_1, A_2, \dots)$ of algebra-elements,
with $A_N \in \otimes^N \mathcal{A}_1 =: \mathcal{A}_1^N$, that is symmetric in the sense that its tail consists
of some finite-particle quantity averaged over an ever-larger number of systems or sites.
That is: a sequence is symmetric if each of its elements $A_P$, for all $P$ greater than some
fixed $N$, consists of some quantity on $M$ (with $M < N$) copies of the system (an
"M-particle observable") averaged over $P$ copies.

(iii) : In the limit $N \to \infty$, symmetric sequences commute; and this will mean that the
macroscopic quantities form a commutative C*-algebra. In fact this algebra is isomorphic
to $C(S(\mathcal{A}_1))$, i.e. the continuous complex functions on the quantum state-space for (the
density matrices on) the basic algebra $\mathcal{A}_1$.

(iv) : The important features (both here and in subsequent Sections) are present in
the construction for the simplest basic algebra $\mathcal{A}_1 := M_2(\mathbb{C})$, i.e. a spin chain with a
spin-half system at each site. So we can throughout the discussion keep this system in mind.
For example, $S(M_2(\mathbb{C}))$ is the Bloch sphere in $\mathbb{R}^3$, with pure states on the boundary
$\partial B^3 = S^2$. So according to (iii), the macroscopic quantities for a chain of spin-halves are
given by the continuous complex functions on the Bloch sphere.

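In Detail: From $\mathcal{A}_1$ = say $M_2(\mathbb{C})$, we will construct a continuous field of C*-algebras
over the discrete set

$$I = 0 \cup 1/\mathbb{N} = \{0, \dots, 1/N, \dots, \tfrac{1}{3}, \tfrac{1}{2}, 1\} \subset [0, 1],$$ (6.15)

by putting

$$\mathcal{A}^{(c)}_0 := C(S(\mathcal{A}_1)); \qquad \mathcal{A}^{(c)}_{1/N} := \mathcal{A}_1^N := \otimes^N \mathcal{A}_1.$$ (6.16)

Thus if $\mathcal{A}_1 = M_2(\mathbb{C})$, then $S(M_2(\mathbb{C}))$ is the Bloch sphere $B^3$ in $\mathbb{R}^3$ and $\mathcal{A}^{(c)}_0$ is the set of
continuous functions on $B^3$.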

To define symmetric sequences, we first say that the symmetrization operator
$S_N : \mathcal{A}_1^N \to \mathcal{A}_1^N$ is given by (linear and continuous) extension of

$$S_N(B_1 \otimes \cdots \otimes B_N) := \frac{1}{N!} \sum_{\sigma \in \mathfrak{S}_N} B_{\sigma(1)} \otimes \cdots \otimes B_{\sigma(N)},$$ (6.17)

where $\mathfrak{S}_N$ is the permutation group on $N$ elements and $B_i \in \mathcal{A}_1$ for all $i = 1, \dots, N$.
Then we define symmetrization maps $j_{NM} : \mathcal{A}_1^M \to \mathcal{A}_1^N$ by

$$j_{NM}(A_M) = S_N(A_M \otimes 1 \otimes \cdots \otimes 1),$$ (6.18)

with $N - M$ copies of $1 \in \mathcal{A}_1$ so as to get an element of $\mathcal{A}_1^N$. For example, $j_{N1} : \mathcal{A}_1 \to \mathcal{A}_1^N$
is given by

$$j_{N1}(B) = \bar{B}^{(N)} = \frac{1}{N} \sum_{k=1}^{N} 1 \otimes \cdots \otimes B_{(k)} \otimes 1 \otimes \cdots \otimes 1,$$ (6.19)

where $B_{(k)}$ is $B$ as an element of the $k$'th copy of $\mathcal{A}_1$ within $\mathcal{A}_1^N$. As $\bar{B}^{(N)}$ indicates, this is
the 'average' of $B$ over all copies of $\mathcal{A}_1$. More generally, in forming $j_{NM}(A_M)$ an operator
$A_M \in \mathcal{A}_1^M$ that involves $M$ sites is averaged over $N > M$ sites. When $N \to \infty$ this means
that one forms a macroscopic average of an $M$-particle operator.

A sequence $A = (A_1, A_2, \dots)$ of algebra elements with $A_N \in \mathcal{A}_1^N$ is symmetric when

$$A_N = j_{NM}(A_M)$$ (6.20)

for some fixed $M$ and all $N > M$. So the tail of a symmetric sequence consists of 'averaged'
quantities, which become macroscopic in the limit $N \to \infty$. The important point is that
symmetric sequences commute in this limit; more precisely

$$\lim_{N \to \infty} \|A_N A'_N - A'_N A_N\| = 0.$$ (6.21)

The averaging of 1-particle spin operators provides a clear example, and illustrates how
$N \to \infty$ corresponds to $\hbar \to 0$. Thus let $A_N := j_{N1}(B)$ and $A'_N := j_{N1}(C)$ with
$B, C \in \mathcal{A}_1$. Then the fact that $[B_{(k)}, C_{(l)}] = 0$ for $k \neq l$ implies that

$$[\bar{B}^{(N)}, \bar{C}^{(N)}] = \frac{1}{N} \overline{[B, C]}^{(N)}.$$ (6.22)

For example, if $\mathcal{A}_1 = M_2(\mathbb{C})$ and if for $B$ and $C$ one takes the spin-$\frac{1}{2}$ operators $S_j = \frac{\hbar}{2}\sigma_j$
for $j = 1, 2, 3$ ($\sigma_j$ the Pauli matrices), then

$$[\bar{S}_j^{(N)}, \bar{S}_k^{(N)}] = i \frac{\hbar}{N} \epsilon_{jkl} \bar{S}_l^{(N)}.$$ (6.23)

Thus we get commutation relations formally like those of the one-particle operators, except
that Planck's constant $\hbar$ is replaced by $\hbar/N$.
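
The relation between the averaged spin operators can be checked by brute force for small $N$. The following is a minimal sketch in plain Python, assuming units in which $\hbar = 1$ and using explicit $2^N \times 2^N$ matrices; all helper names are illustrative. It confirms that the commutator of the averaged x- and y-spins equals $i/N$ times the averaged z-spin, so that its size shrinks like $1/(2N)$.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(a, b, scale=1):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled(a, c):
    return [[c * x for x in row] for row in a]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

I2 = [[1, 0], [0, 1]]
SX = [[0, 0.5], [0.5, 0]]        # spin-half operators with hbar = 1
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]

def site_operator(b, k, n):
    """B acting on copy k (0-based) of n spin-halves, identity elsewhere."""
    out = [[1]]
    for site in range(n):
        out = kron(out, b if site == k else I2)
    return out

def averaged(b, n):
    """(1/n) * sum over k of B_(k): the average of B over the n copies."""
    dim = 2 ** n
    total = [[0] * dim for _ in range(dim)]
    for k in range(n):
        total = add(total, site_operator(b, k, n))
    return scaled(total, 1 / n)

def check(n):
    """Return (size of [Sx_avg, Sy_avg], deviation from (i/n) * Sz_avg)."""
    sx, sy, sz = averaged(SX, n), averaged(SY, n), averaged(SZ, n)
    comm = add(matmul(sx, sy), matmul(sy, sx), scale=-1)
    residual = max_abs(add(comm, scaled(sz, 1j / n), scale=-1))
    return max_abs(comm), residual

for n in range(1, 6):
    size, residual = check(n)
    print(f"N = {n}: largest commutator entry = {size:.4f}, residual = {residual:.1e}")
```

The largest entry is used as a measure of size only because the commutator here is diagonal; for a general operator one would use the operator norm instead.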

We are now ready to define our continuous field of algebras. A section of the field
with fibers (6.16) is a sequence $A = (A_0, A_1, A_2, \dots)$ with $A_0 \in \mathcal{A}^{(c)}_0$ and $A_N \in \mathcal{A}_1^N$. We
say that a sequence $A$ defines a continuous section of the field iff:

• $(A_1, A_2, \dots)$ is approximately symmetric; i.e. for any $\varepsilon > 0$ there is an $N_\varepsilon$ and a
symmetric sequence $A'$ such that $\|A_N - A'_N\| < \varepsilon$ for all $N > N_\varepsilon$;

• $A_0(\omega) = \lim_{N \to \infty} \omega^N(A_N)$, where $\omega \in S(\mathcal{A}_1)$ and $\omega^N \in S(\mathcal{A}_1^N)$ is the tensor product
of $N$ copies of $\omega$, i.e.

$$\omega^N(B_1 \otimes \cdots \otimes B_N) = \omega(B_1) \cdots \omega(B_N).$$ (6.24)

This choice of continuous sections defines a continuous field of C*-algebras over $I = 0 \cup 1/\mathbb{N}$
with fibers (6.16). In fact it follows that

$$\lim_{N \to \infty} \|A_N\| = \|A_0\|.$$ (6.25)

To sum up: the main point is that, in accordance with (6.21), the macroscopic quantities
are organized in the limit $N \to \infty$ into a commutative C*-algebra isomorphic to
$C(S(\mathcal{A}_1))$.
6.2.4 The quantum infinite: quasi-local sequences



I now treat the quantum $N \to \infty$ limit of $N$ copies of the algebra $\mathcal{A}_1$. The key idea
will be that of a quasilocal sequence $(A_1, A_2, \dots)$ of algebra elements, with $A_N \in \mathcal{A}_1^N$. In
constructing the continuous field of C*-algebras, this notion will play an analogous role
to that played in Section 6.2.3 by symmetric sequences. The idea is that the tail of a
quasilocal sequence becomes arbitrarily close (in norm) to an element $A_M \otimes 1 \otimes 1 \otimes \cdots$, so
that the sequence "ignores, in the limit" all but a finite part of the system. The intuition
behind this restriction is that human limitation means we can observe only a finite part
of the system. In any case, the restriction allows us to make precise the heuristic idea of
the algebra for an infinite quantum system, which we might write heuristically as $\mathcal{A}_1^\infty$ =
say $M_2(\mathbb{C})^\infty$.

Formally, the infinite quantum system is the inductive limit C*-algebra

$$\overline{\bigcup_{N \in \mathbb{N}} \mathcal{A}_1^N}$$ (6.26)

of the family of C*-algebras $(\mathcal{A}_1^N)$. Eq. (6.26) consists of all equivalence classes $[A] = A_0$
of quasilocal sequences $A = (A_1, A_2, \dots)$, under the equivalence relation $A \sim B$ iff
$\lim_{N \to \infty} \|A_N - B_N\| = 0$. As the notation suggests, each $\mathcal{A}_1^N$ is contained in
$\overline{\bigcup_{N \in \mathbb{N}} \mathcal{A}_1^N}$ as a C*-subalgebra by identifying $A_N \in \mathcal{A}_1^N$ with a quasilocal sequence
that after the $N$th term just tensors with the identity, viz. the sequence
$A = (0, \dots, 0, A_N, A_N \otimes 1, A_N \otimes 1 \otimes 1, \dots)$, and forming its equivalence class
$[A] = A_0$ in $\overline{\bigcup_{N \in \mathbb{N}} \mathcal{A}_1^N}$.

In Detail: A sequence $A = (A_1, A_2, \dots)$ ($A_N \in \mathcal{A}_1^N$) is local if for some fixed $M$ and
all $N > M$: $A_N = A_M \otimes 1 \otimes \cdots \otimes 1$ (with $N - M$ copies of the unit $1 \in \mathcal{A}_1$). A
sequence is quasilocal when for any $\varepsilon > 0$ there is an $N_\varepsilon$ and a local sequence $A'$ such that
$\|A_N - A'_N\| < \varepsilon$ for all $N > N_\varepsilon$.

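We now define the inductive limit C*-algebra

$\overline{\cup_{N \in \mathbb{N}} \mathcal{A}_1^N}$ (6.27)

of the family of C*-algebras $(\mathcal{A}_1^N)$ with respect to the inclusion maps
$\mathcal{A}_1^N \hookrightarrow \mathcal{A}_1^{N+1}$ given by $A_N \mapsto A_N \otimes 1$. As a set, 6.27 consists of all
equivalence classes $[A] = A_0$ of quasilocal sequences $A$ under the equivalence relation
$A \sim B$ when $\lim_{N \to \infty} \|A_N - B_N\| = 0$. The norm on $\overline{\cup_{N \in \mathbb{N}} \mathcal{A}_1^N}$ is

$\|A_0\| = \lim_{N \to \infty} \|A_N\|$ , (6.28)

and other C*-algebraic structure is inherited from the quasilocal sequences in the
obvious way (e.g., $A_0^* = [A^*]$ with $A^* = (A_1^*, A_2^*, \ldots)$, etc.). Thus each $\mathcal{A}_1^N$ is
contained in $\overline{\cup_{N \in \mathbb{N}} \mathcal{A}_1^N}$ as a C*-subalgebra by identifying $A_N \in \mathcal{A}_1^N$ with the local sequence
$A = (0, \ldots, 0, A_N, A_N \otimes 1, A_N \otimes 1 \otimes 1, \ldots)$ and forming its equivalence class $A_0$ in
$\overline{\cup_{N \in \mathbb{N}} \mathcal{A}_1^N}$.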

So we define a second continuous field of C*-algebras $\mathcal{A}^{(q)}$ over $0 \cup 1/\mathbb{N}$, with fibers

$\mathcal{A}^{(q)}_0 = \overline{\cup_{N \in \mathbb{N}} \mathcal{A}_1^N}$ ; $\mathcal{A}^{(q)}_{1/N} = \mathcal{A}_1^N$ , (6.29)

by declaring that the continuous sections are of the form $(A_0, A_1, A_2, \ldots)$ where $(A_1, A_2, \ldots)$
is quasilocal and $A_0$ is defined to be the equivalence class of this quasilocal sequence, as
just explained.

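For $N < \infty$ this field has the same fibers

$\mathcal{A}^{(q)}_{1/N} = \mathcal{A}^{(c)}_{1/N} = \mathcal{A}_1^N$ (6.30)

as Section 6.2.3's continuous field $\mathcal{A}^{(c)}$, but the fiber $\mathcal{A}^{(q)}_0$ is completely different from
$\mathcal{A}^{(c)}_0$. For if $\mathcal{A}_1$ is noncommutative then so is $\mathcal{A}^{(q)}_0$, since it contains all $\mathcal{A}_1^N$.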



6.2.5 Comparing the classical and quantum limits: classical states and the 
de Finetti theorem 

One natural way to study the relations between the fields $\mathcal{A}^{(c)}$ and $\mathcal{A}^{(q)}$ is to consider
those families of abstract states $(\omega_1, \omega_{1/2}, \ldots, \omega_{1/N}, \ldots)$ ($\omega_{1/N}$ is a state on $\mathcal{A}_1^N$) that have
appropriate limit states on both $\mathcal{A}^{(c)}_0$ and $\mathcal{A}^{(q)}_0$. (Here 'appropriate' is made precise in
terms of Section 6.2.2's notion of a continuous field of states.) In this Section, I introduce
the most important such family, the permutation-invariant states. We will see in Section
6.3.1 how they yield superselection sectors, especially in Section 6.2.1's prototype system,
spin chains.

Of course, any state $\omega^{(q)}_0$ on $\mathcal{A}^{(q)}_0$ defines a state $\omega^{(q)}_{0|1/N}$ on each $\mathcal{A}_1^N$ by restriction; and
the sequence of these states has the given $\omega^{(q)}_0$ as its appropriate limit state on $\mathcal{A}^{(q)}_0$. If
this sequence of states also converges with respect to the other continuous field of algebras,
i.e. converges to some state $\omega^{(c)}_0$ on $\mathcal{A}^{(c)}_0$, then the given state $\omega^{(q)}_0$ is called classical.

We specialize to an important class of classical states, viz. those that are "indifferent
to" the label of the component systems (e.g. of the sites in the spin chain). Formally,
we say that a state $\omega^{(q)}_0$ on $\mathcal{A}^{(q)}_0$ is permutation-invariant if its restriction to each of the
$\mathcal{A}_1^N$ is invariant under the natural action of the symmetric group $\mathfrak{S}_N$ on $\mathcal{A}_1^N$.

For our purposes, the important point about these states is that they give a close
quantum analogue of the de Finetti representation theorem. Roughly speaking, this
theorem says that any classical probability measure on an infinite Cartesian power
probability space $X^\infty := X \times X \times \cdots$ that is permutation-invariant (under permutations
between copies of the factor space $X$) is a unique mixture of (i.e. has a unique
integral decomposition in terms of) infinite product probability measures $p^\infty$ given by
$p^\infty(Y_1 \times Y_2 \times \cdots) := p(Y_1) \cdot p(Y_2) \cdots$ (with $Y_i$ in the sigma-algebra of the $i$th copy of $X$) for
some probability measure $p$ on $X$. Section 6.3.1 will describe how the quantum analogue
of this theorem gives a precise yet general framework for the emergence of superselection.
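
The easy direction of the classical theorem (a mixture of product measures is permutation-invariant) can be checked by hand in a small case. The following Python sketch is only an illustration under stated assumptions: the factor space is taken to be $X = \{0, 1\}$, and the two-component mixing measure `MIXTURE` and all function names are hypothetical choices of mine. It confirms that the mixture gives every reordering of a finite string the same probability, although the mixture is not itself a product measure.

```python
from itertools import permutations

# Hypothetical mixing measure: (weight, bias p) for product measures on {0, 1}.
MIXTURE = [(0.3, 0.2), (0.7, 0.9)]

def mixture_probability(outcomes, mixture=MIXTURE):
    """Probability of a finite 0/1 string under the mixture of product measures."""
    ones = sum(outcomes)
    zeros = len(outcomes) - ones
    return sum(w * p**ones * (1 - p)**zeros for w, p in mixture)

def is_permutation_invariant(outcomes, tol=1e-12):
    """Check that every reordering of the string has the same probability."""
    base = mixture_probability(outcomes)
    return all(abs(mixture_probability(perm) - base) < tol
               for perm in permutations(outcomes))

print(is_permutation_invariant((1, 0, 0, 1, 0)))
# Exchangeable, but not a product measure: P(1,1) differs from P(1)^2.
print(mixture_probability((1, 1)), mixture_probability((1,)) ** 2)
```

The substantive content of the theorem is the converse (every permutation-invariant measure on $X^\infty$ arises in this way, uniquely), which no finite check of this kind can establish.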

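In Detail: Consider those families of states $(\omega_1, \omega_{1/2}, \ldots, \omega_{1/N}, \ldots)$ (where $\omega_{1/N}$ is a
state on $\mathcal{A}_1^N$) that have limit states both $\omega^{(c)}_0$ on $\mathcal{A}^{(c)}_0$ and $\omega^{(q)}_0$ on $\mathcal{A}^{(q)}_0$, such that the
ensuing families $(\omega^{(c)}_0, \omega_1, \omega_{1/2}, \ldots)$ and $(\omega^{(q)}_0, \omega_1, \omega_{1/2}, \ldots)$ are continuous fields of states
on $\mathcal{A}^{(c)}$ and on $\mathcal{A}^{(q)}$ (in the sense of Section 6.2.2).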

Any state $\omega^{(q)}_0$ on $\mathcal{A}^{(q)}_0$ defines a state $\omega^{(q)}_{0|1/N}$ on $\mathcal{A}_1^N$ by restriction, and the
ensuing field of states on $\mathcal{A}^{(q)}$ is clearly continuous. (Conversely, any continuous field
$(\omega^{(q)}_0, \omega_1, \omega_{1/2}, \ldots, \omega_{1/N}, \ldots)$ of states on $\mathcal{A}^{(q)}$ becomes arbitrarily close to a field of the
above type for $N$ large.) But the restrictions $\omega^{(q)}_{0|1/N}$ of a given state $\omega^{(q)}_0$ on $\mathcal{A}^{(q)}_0$ to $\mathcal{A}_1^N$
may well not converge to a state $\omega^{(c)}_0$ on $\mathcal{A}^{(c)}_0$ for $N \to \infty$. States $\omega^{(q)}_0$ on
$\overline{\cup_{N \in \mathbb{N}} \mathcal{A}_1^N}$ whose restrictions $\omega^{(q)}_{0|1/N}$ converge to a state $\omega^{(c)}_0$ on $\mathcal{A}^{(c)}_0$ are called classical.

In other words (cf. the definition of $\mathcal{A}^{(c)}$, especially eq. 6.24): $\omega^{(q)}_0$ is classical when
there exists a probability measure $\mu_0$ on $S(\mathcal{A}_1)$ such that

$\lim_{N \to \infty} \int_{S(\mathcal{A}_1)} d\mu_0(\rho) \, (\rho^N(A_N) - \omega^{(q)}_{0|1/N}(A_N)) = 0$ (6.31)

for each approximately symmetric sequence $(A_1, A_2, \ldots)$. In other words: a classical state
$\omega^{(q)}_0$ with limit state $\omega^{(c)}_0$ on $C(S(\mathcal{A}_1))$ defines a probability measure $\mu_0$ on $S(\mathcal{A}_1)$ by

$\omega^{(c)}_0(f) = \int_{S(\mathcal{A}_1)} d\mu_0 \, f$ , (6.32)

which describes the probability distribution of the macroscopic quantities in that state.

We now make this more concrete by specializing to an important class of classical
states. We say that a state $\omega$ on $\overline{\cup_{N \in \mathbb{N}} \mathcal{A}_1^N}$ is permutation-invariant when each of its
restrictions to $\mathcal{A}_1^N$ is invariant under the natural action of $\mathfrak{S}_N$ on $\mathcal{A}_1^N$ (i.e. $\sigma \in \mathfrak{S}_N$ maps
an elementary tensor $A_N = B_1 \otimes \cdots \otimes B_N \in \mathcal{A}_1^N$ to $B_{\sigma(1)} \otimes \cdots \otimes B_{\sigma(N)}$, cf. 6.17).

We can now state the quantum analogue of the de Finetti representation theorem: a
permutation-invariant state on $\mathcal{A}^{(q)}_0 = \overline{\cup_{N \in \mathbb{N}} \mathcal{A}_1^N}$ is a unique mixture of (i.e. has a unique
integral decomposition in terms of) infinite product states $\rho^\infty$, that are defined (with
$\rho \in S(\mathcal{A}_1)$) by saying that if $A_0 \in \mathcal{A}^{(q)}_0$ is an equivalence class $[A_1, A_2, \ldots]$, then (cf. 6.24)

$\rho^\infty(A_0) = \lim_{N \to \infty} \rho^N(A_N)$ . (6.33)

An equivalent definition is to say that the restriction of $\rho^\infty$ to any $\mathcal{A}_1^N \subset \mathcal{A}^{(q)}_0$ is given by
$\otimes^N \rho$.

In other words, the theorem says: any permutation-invariant state $\omega^{(q)}_0$ has a unique
decomposition

$\omega^{(q)}_0(A_0) = \int_{S(\mathcal{A}_1)} d\mu(\rho) \, \rho^\infty(A_0)$ , (6.34)

where $\mu$ is a probability measure on $S(\mathcal{A}_1)$ and $A_0 \in \mathcal{A}^{(q)}_0$. We can also state this in more
geometric language, as follows. The set $S^{\mathfrak{S}}$ of all permutation-invariant states in $S(\mathcal{A}^{(q)}_0)$
is a compact convex set, and is the (weak*-closed) convex hull of its (extreme) boundary
$\partial_e S^{\mathfrak{S}}$. So the claim of the theorem is that this boundary consists of the infinite product
states (and so is isomorphic to $S(\mathcal{A}_1)$ in the obvious way).

To sum up this Section, especially 6.31, 6.32 and 6.34: if $\omega^{(q)}_0$ is permutation-invariant,
then it is classical. The associated limit state $\omega^{(c)}_0$ on $\mathcal{A}^{(c)}_0$ is characterized
by the fact that the measure $\mu_0$ in 6.32 coincides with the measure $\mu$ in 6.34.

6.3 The claims illustrated by emergent superselection

The material in Section 6.2 gives a precise and powerful framework for understanding the
emergence of superselection in the $N \to \infty$ limit. It will be clearest to first summarize
this (Section 6.3.1, again following Landsman (2006)), and then spell out the illustrations
of my claims (Section 6.3.2 et seq.).



6.3.1 Superselection from permutation-invariant states, in spin chains 

I begin with generalities, and then return to spin chains. In the algebraic approach to
quantum theory, a superselection sector is taken to be an appropriate equivalence class
(under unitary isomorphism) of irreducible representations of the system's abstract
algebra of quantities $\mathcal{A}$. The word 'appropriate' reflects the fact that most irreducible
representations of a typical C*-algebra $\mathcal{A}$ used in physics are physically irrelevant, and so
need to be excluded (jargon: one needs a selection criterion). Here, we take the algebra of
quantities to be $\mathcal{A}^{(q)}_0$; and take an (equivalence class of) irreducible representations of $\mathcal{A}^{(q)}_0$
to be a superselection sector iff it corresponds to a permutation-invariant pure state on
$\mathcal{A}^{(q)}_0$. (Here 'correspond' is made precise using the GNS theorem, viz. as equivalence to the
GNS-representation of the permutation-invariant state.) With this selection criterion, the
results in Section 6.2, especially at 6.33 onwards, trivially imply that there is a bijective
correspondence between pure states on $\mathcal{A}_1$ and superselection sectors of $\mathcal{A}^{(q)}_0$.

The results are vividly illustrated by spin chains. In Section 6.2.1, I did not give details
about how to build the infinite tensor product state space $\mathcal{H}_1^\infty$, with say $\mathcal{H}_1 = \mathbb{C}^2$. But as
one would hope, we have: if $(e_i)$ is some basis of $\mathbb{C}^2$, an orthonormal basis of $\mathcal{H}_1^\infty$ consists
of all different infinite strings $e_{i_1} \otimes \cdots \otimes e_{i_n} \otimes \cdots$, where $e_{i_n}$ is $e_i$ regarded as a vector in $\mathbb{C}^2$.
(And similarly, when we choose the "building-block" algebra $\mathcal{A}_1 \neq M_2(\mathbb{C})$.) We denote the
multi-index $(i_1, \ldots, i_n, \ldots)$ simply by $I$, and the corresponding basis vector by $e_I$. This
Hilbert space $\mathcal{H}_1^\infty$ carries a natural faithful representation $\pi$ of $\mathcal{A}^{(q)}_0$: if $A_0 \in \mathcal{A}^{(q)}_0$ is an
equivalence class $[A_1, A_2, \ldots]$, then $\pi(A_0) e_I = \lim_{N \to \infty} A_N e_I$, where $A_N$ acts on the first
$N$ components of $e_I$ and leaves the remainder unchanged.

Now the important point is that although each $\mathcal{A}_1^N$ acts irreducibly on $\mathcal{H}_1^N$, the
representation $\pi(\mathcal{A}^{(q)}_0)$ on $\mathcal{H}_1^\infty$ thus constructed is highly reducible. The reason for this is
that by definition (quasi-) local elements of $\mathcal{A}^{(q)}_0$ leave the infinite tail of a vector in $\mathcal{H}_1^\infty$
(almost) unaffected, so that vectors with different tails lie in different superselection
sectors. Without the quasi-locality condition on the elements of $\mathcal{A}^{(q)}_0$, no superselection rules
would arise.

For example, in terms of the usual basis $\{\uparrow = \binom{1}{0}, \downarrow = \binom{0}{1}\}$ of $\mathbb{C}^2$, the vectors
$\Psi_\uparrow = \uparrow \otimes \uparrow \cdots \uparrow \cdots$ (i.e. an infinite product of 'up' vectors) and
$\Psi_\downarrow = \downarrow \otimes \downarrow \cdots \downarrow \cdots$ (i.e. an infinite product of 'down' vectors) lie in different sectors.
(Cf. items (a) and (b) in Section 6.2.1 and (iv) in Section 6.2.3.) The reason why the inner
product $(\Psi_\uparrow, \pi(A)\Psi_\downarrow)$ vanishes for any $A \in \mathcal{A}^{(q)}_0$ is that for local quantities $A$ one has
$\pi(A) = A_M \otimes 1 \otimes \cdots 1 \cdots$ for some $A_M \in B(\mathcal{H}_1^M)$; the inner product in question therefore
involves infinitely many factors $(\uparrow, 1 \downarrow) = (\uparrow, \downarrow) = 0$. For quasilocal $A$ the operator $\pi(A)$
might have a small nontrivial tail, but again the inner product vanishes by an approximation argument.

More generally, elementary analysis shows that $(\Psi_u, \pi(A)\Psi_v) = 0$ whenever $\Psi_u = \otimes^\infty u$
and $\Psi_v = \otimes^\infty v$ for unit vectors $u, v \in \mathbb{C}^2$ with $u \neq v$. The corresponding vector states
$\psi_u$ and $\psi_v$ on $\mathcal{A}^{(q)}_0$ (i.e. $\psi_u(A) = (\Psi_u, \pi(A)\Psi_u)$ etc.) are obviously permutation-invariant
and hence classical. Identifying $S(M_2(\mathbb{C}))$ with $B^3 \subset \mathbb{R}^3$, the corresponding limit state
$(\psi_u)_0$ on $\mathcal{A}^{(c)}_0$ defined by $\psi_u$ is given by (evaluation at) the point $u = (x, y, z)$ of $\partial B^3 = S^2$
(i.e. the two-sphere) for which the corresponding density matrix $\rho(u)$ is the projection
operator onto $u$. (It follows that $\psi_u$ and $\psi_v$ are unitarily inequivalent.)
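
The "elementary analysis" can be previewed at finite $N$. For a quantity acting on the first site only, the matrix element between two $N$-fold product vectors factorizes into a one-site matrix element times $N - 1$ copies of the overlap $(u, v)$, which has modulus less than one when $u$ and $v$ are distinct up to phase. The Python sketch below is a hedged illustration, not part of Landsman's treatment: the choice of $\sigma_x$ as the one-site quantity, the restriction to real vectors, and all names are my assumptions.

```python
import math

UP = [1.0, 0.0]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]  # assumed one-site quantity, chosen for illustration

def tilted(theta):
    """Real unit vector in C^2 at polar angle theta from 'up' on the Bloch sphere."""
    return [math.cos(theta / 2), math.sin(theta / 2)]

def tensor_power(vec, n):
    """Components of the n-fold product vector vec (x) ... (x) vec."""
    out = [1.0]
    for _ in range(n):
        out = [a * b for a in out for b in vec]
    return out

def local_matrix_element(theta, n_sites, op=SIGMA_X):
    """(Psi_u, (op (x) 1 (x) ... (x) 1) Psi_v) on n_sites factors, u = up, v = tilted(theta)."""
    v = tilted(theta)
    op_v = [sum(op[r][c] * v[c] for c in range(2)) for r in range(2)]
    # A one-site operator acts on a product vector factor by factor:
    # (op (x) 1 ...)(v (x) v ...) = (op v) (x) v (x) ... (x) v
    right = op_v
    for _ in range(n_sites - 1):
        right = [a * b for a in right for b in v]
    left = tensor_power(UP, n_sites)
    return sum(a * b for a, b in zip(left, right))  # real vectors: no conjugation needed

for n in (1, 4, 8, 12):
    print(n, local_matrix_element(2 * math.pi / 3, n))
```

With these choices the value is $\sin(\theta/2)\cos(\theta/2)^{N-1}$, so it decays geometrically in $N$; for orthogonal $u$ and $v$ (here $\theta = \pi$) it already vanishes, up to rounding, once $N$ exceeds the number of sites on which the quantity acts.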

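We conclude that each unit vector $u \in \mathbb{C}^2$ determines a superselection sector $\pi_u$, namely
the GNS-representation of the corresponding state $\psi_u$, and that each such sector is realized
as a subspace $\mathcal{H}_u$ of $\mathcal{H}_1^\infty$ (viz. $\mathcal{H}_u = \overline{\pi(\mathcal{A}^{(q)}_0)\Psi_u}$). Moreover, since a permutation-invariant
state on $\mathcal{A}^{(q)}_0$ is pure iff it is of the form $\psi_u$, these are all the superselection sectors. Thus
we have the subspace (of $\mathcal{H}_1^\infty$) and subrepresentation (of $\pi$)

$\mathcal{H}_{\mathfrak{S}} := \oplus_{u \in S^2} \mathcal{H}_u$ ;
$\pi_{\mathfrak{S}}(\mathcal{A}^{(q)}_0) := \oplus_{u \in S^2} \pi_u(\mathcal{A}^{(q)}_0)$ , (6.35)

where $\pi_u$ is simply the restriction of $\pi$ to $\mathcal{H}_u \subset \mathcal{H}_1^\infty$.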

In the presence of superselection, there are operators that distinguish different sectors
whilst being a multiple of the unit in each sector; cf. items (c) and (d) of Section 6.2.1.
In the framework developed from Section 6.2.2 onwards, these operators are the
macroscopic quantities of Section 6.2.3. In fact, one can show that for any approximately symmetric
sequence $(A_1, A_2, \ldots)$ the limit

$A = \lim_{N \to \infty} \pi_{\mathfrak{S}}(A_N)$ (6.36)

exists in the strong operator topology on $B(\mathcal{H}_{\mathfrak{S}})$. Moreover, let $A_0 \in \mathcal{A}^{(c)}_0 = C(S(\mathcal{A}_1))$
be the function defined by the given sequence (recall that $A_0(\omega) = \lim_{N \to \infty} \omega^N(A_N)$, cf.
eq. 6.24). Then the map $A_0 \mapsto A$ defines a faithful representation of $\mathcal{A}^{(c)}_0$ on $\mathcal{H}_{\mathfrak{S}}$, which
we again call $\pi_{\mathfrak{S}}$. A calculation shows that $\pi_{\mathfrak{S}}(A_0)\Psi = A_0(u)\Psi$ for $\Psi \in \mathcal{H}_u$, or, in other
words,

$\pi_{\mathfrak{S}}(A_0) = \oplus_{u \in S^2} A_0(u) 1_{\mathcal{H}_u}$ . (6.37)

Thus the $\pi_{\mathfrak{S}}(A_0)$ are indeed the promised operators.
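
A finite-$N$ precursor of this sharpness can be computed directly. The Python sketch below is an illustration under my own assumptions: it takes the site-averaged $\sigma_z$ as the approximately symmetric sequence and a product of identical pure qubit states whose Bloch vector has $z$-component `u_z`; the function name is hypothetical. It enumerates the joint eigenbasis of the single-site $\sigma_z$ operators and returns the mean and variance of the averaged quantity.

```python
from itertools import product

def averaged_sigma_z_stats(u_z, n_sites):
    """Mean and variance of (1/N) sum_i sigma_z^(i) in the N-fold product of a
    qubit state whose Bloch vector has z-component u_z."""
    p_up = (1 + u_z) / 2                      # Born probability of 'up' at each site
    mean = second_moment = 0.0
    for config in product((+1, -1), repeat=n_sites):   # joint sigma_z eigenbasis
        weight = 1.0
        for s in config:
            weight *= p_up if s == +1 else 1 - p_up
        value = sum(config) / n_sites         # eigenvalue of the averaged operator
        mean += weight * value
        second_moment += weight * value ** 2
    return mean, second_moment - mean ** 2

for n in (1, 2, 4, 8):
    print(n, averaged_sigma_z_stats(0.6, n))
```

The mean stays at `u_z` for every $N$, while the variance equals $(1 - u_z^2)/N$: the averaged quantity becomes dispersion-free on the product state only in the limit, which is the sense in which it acts as a multiple of the unit on the sector.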

6.3.2 Emergence in the limit: with reduction — and without 

I turn at last to how the example of superselection illustrates my claims. It is clear
from the wealth of details in the preceding Sections that my positive claims, (1:Deduce)
and (2:Before), can be richly illustrated, in terms of both quantities and states. So I
will confine myself to mentioning the obvious illustrations, and referring to the previous
discussions and equations.

As to (1:Deduce): we again have 'reduction as deduction' in as strong a sense as
you could demand, provided we take the limit. There are several illustrations, some
concerning quantities or algebras of them, some concerning states. I begin with two brief
cases, referring back to equations in Subsections of Section 6.2.

An obvious case, which concerns quantities, is that symmetric sequences $(A_N)$ of
quantities, $A_N \in \mathcal{A}_1^N$, commute in the limit $N \to \infty$. Recall eq. 6.21 in Section 6.2.3.
Another obvious case, which equally concerns states, is spin density in spin chains, as in
part (c) of Section 6.2.1. Thus recall the limiting behaviour of the average, over $2N + 1$
sites, of the spin matrices, i.e. the limiting behaviour of $m^N$ defined by eq. 6.5. The
discussion from eq. 6.6 to eq. 6.9 deduced that the limiting spin density has the same
value, $\mathbf{k}$, for all states in the representation $\mathcal{H}^{(+)}$; and so it is a classical quantity, or
superselection operator.

Here are three more substantial cases of such a deduction, using my mnemonic notations,
$T_b$ and $T_t$: that is, cases where $T_b$ implies $T_t$. The first two concern quantities, the
third concerns states. Here, I shall mostly refer back to the summary in Section 6.3.1.

(A): Take as $T_b$ the continuous field of C*-algebras $\mathcal{A}^{(q)}$ together with its
representation theory: or more modestly, together with that part of its representation theory
that deals with permutation-invariant states. Take $T_t$ to be a statement of superselection,
e.g. that $\mathcal{A}^{(q)}_0$ acts (highly!) reducibly on the infinite-system state space $\mathcal{H}_1^\infty$. Then we
have just seen in Section 6.3.1 that $T_b$ implies $T_t$. For the requirement for sequences of
operators to be quasi-local makes the inner products between states in different sectors
of $\mathcal{H}_1^\infty$ vanish. Compare the discussion leading up to eq. 6.35.

(B): Let $T_b$ be the continuous field of C*-algebras $\mathcal{A}^{(c)}$ together with $\pi_{\mathfrak{S}}$, the faithful
representation of $\mathcal{A}^{(c)}_0$ on $\mathcal{H}_{\mathfrak{S}}$ defined by eq. 6.36. Take $T_t$ to state that there are
superselection operators (classical quantities) that restrict to a multiple of the identity on each
sector $\mathcal{H}_u$. Then we have just seen at the end of Section 6.3.1 that $T_b$ implies $T_t$. For compare the
discussion leading up to eq. 6.37: any approximately symmetric sequence $(A_N)$ (and so
any continuous section of the field) defines such a classical quantity, viz. $\pi_{\mathfrak{S}}(A_0)$.

(C): We can focus instead on states. Take $T_b$ to encompass both continuous fields of
C*-algebras, $\mathcal{A}^{(c)}$ and $\mathcal{A}^{(q)}$, and to consider families of states on them. Take $T_t$ to be the
theory of classical states, especially permutation-invariant states $\omega^{(q)}_0$; or more modestly,
to be just the quantum de Finetti representation theorem, eq. 6.34. Then Section 6.2.5
shows that $T_b$ implies $T_t$.

I turn to "the other side of the coin" of (1:Deduce): the failure of the reduction
at finite $N$. For (A) and (B), the point is familiar from elementary quantum theory.
Finite $N$ means there is no algebra $\mathcal{A}^{(c)}_0$ or $\mathcal{A}^{(q)}_0$ to consider: there are only the algebras
$\mathcal{A}^{(c)}_{1/N} = \mathcal{A}^{(q)}_{1/N} := \mathcal{A}_1^N$. But in general, if $\mathcal{A}_1$ acts irreducibly on $\mathcal{H}_1$, then so does $\mathcal{A}_1^N$ on $\mathcal{H}_1^N$.
So (A) fails. And then Schur's Lemma (that if an algebra acts irreducibly, its commutant
is trivial) immediately implies that (B) fails, i.e. there are only trivial superselection
operators.

As to (C), the point is less familiar. But it is again straightforward, especially if
we limit ourselves by choosing $T_t$ to be the quantum de Finetti representation theorem,
not the whole theory of classical states (of course choosing a weaker $T_t$ adds to the
dialectical force of the point that it is not implied). Here the point is that for finite $N$,
the theorem fails. The reason is essentially the same as for the corresponding failure of
the classical theorem for finite $N$. In short, finite $N$ corresponds to drawing from an
urn without replacement, rather than with replacement. In the classical case, this means
that a permutation-invariant probability measure is a unique mixture, not of product
measures, but of hypergeometric measures (cf. Diaconis 1977, Diaconis and Freedman
1980, Jeffrey 1988, pp. 240-245). Similar remarks apply in the quantum case: (C) of
Section 6.3.3 will give a few more details.
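
The classical urn picture can be made quantitative in a few lines. The Python sketch below is my own illustration, with hypothetical function names and an assumed two-colour urn: it compares the number of red balls in `draws` draws without replacement (a hypergeometric law) with the number in `draws` draws with replacement (a binomial law), using total variation distance.

```python
from math import comb

def hypergeom_pmf(k, n_balls, n_red, draws):
    """P(k red) when drawing without replacement from an urn of n_balls, n_red of them red."""
    return comb(n_red, k) * comb(n_balls - n_red, draws - k) / comb(n_balls, draws)

def binom_pmf(k, draws, p):
    """P(k red) when drawing with replacement, each draw red with probability p."""
    return comb(draws, k) * p**k * (1 - p)**(draws - k)

def tv_distance(n_balls, draws, red_fraction=0.5):
    """Total variation distance between the two laws for the number of red balls."""
    n_red = round(red_fraction * n_balls)
    p = n_red / n_balls
    return 0.5 * sum(abs(hypergeom_pmf(k, n_balls, n_red, draws) - binom_pmf(k, draws, p))
                     for k in range(draws + 1))

for n_balls in (10, 100, 1000):
    print(n_balls, tv_distance(n_balls, draws=5))
```

For a fixed number of draws the distance shrinks as the urn grows. Since the binomial law is only one candidate mixture of product measures, the computed distance is an upper bound on the distance to the nearest such mixture, not the distance itself.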

Finally, I note two respects in which this example's illustration of (1:Deduce) is similar
to that given by fractals, and dissimilar to that given by the method of arbitrary functions.
I noted these respects at the end of Section 5.2.1. First: the emergent behaviour involves
mathematical structures which are new at the limit (such as unitarily inequivalent
representations, and so superselection). This is like fractals, where the emergent behaviour
involved non-integer dimension at the limit; but unlike the method of arbitrary functions,
where the emergent behaviour involved, more simply, deducing the limiting behaviour of
functions given at finite $N$.

The second respect is related to the first. Namely, this example considers an infinite
quantum system: in Section 3.1's notation, an infinite system $\sigma(\infty)$. Again, this is like
fractals, where the Cantor set, Koch snowflake etc. count as infinite systems; but unlike
the method of arbitrary functions, since, for example, no roulette wheel has infinitely
many arcs.

6.3.3 Emergence before the limit 

(2:Before) claims that before the limit, there is emergence in a weaker but still vivid
sense. This claim is illustrated in a manner parallel to my previous examples: the method
of arbitrary functions, and fractals. In short, one just has to interpret (sensibly!) the
example's formalism describing finite $N$. And as in those examples, the discussion can
be made vivid by referring to practical purposes. There, I cited the practical purposes
of a casino in making a wheel that is fair enough; and the purposes of a film studio in
making an image look fractal at small enough spatial scales. Similarly here: imagine an
experimental physicist making, or a theoretical physicist describing, a sample or device
comprising $N$ atoms or sites (perhaps for a nanotechnology project), with $N$ large enough
for behaviour characteristic of superselection to occur.

Again, one can pick several illustrations. I shall give three. The first concerns states in
spin chains as in Section 6.2.1; the second concerns symmetrized quantities as in Section
6.2.3; the third returns us to the quantum de Finetti representation theorem, as in Section
6.3.2. All three have the merit (absent from my discussion of my previous examples!) of
being quantitative about the "rate of emergence": about how large an $N$ is needed for
emergent behaviour.

(A): The discussion in part (c) of Section 6.2.1 concentrated on the limiting behaviour
of the expectation value of $m^N$ on states in $\mathcal{H}^{(+)}$. We deduced (cf. eq. 6.6 to eq. 6.9)
that the limiting value is $\mathbf{k}$, the unit vector in the z-direction. But that discussion can
of course be adapted to entail statements about the situation for finite $N$. To give one
example: consider the x or y component of the matrix element of $m^N$ between two
elements $\phi_s$, $\phi_{s'}$ of the basis of $\mathcal{H}^{(+)}$ that correspond to two doubly infinite (+1,-1)-sequences,
$s = (s_n)$ and $s' = (s'_n)$, both with all but finitely many elements ($s_n$ or $s'_n$) equal to +1.
In part (c) of Section 6.2.1, we argued that these matrix elements tend
to 0 as $N \to \infty$. But as to finite $N$: if we know that $s_n = s'_n = +1$ for all $n$ with $|n| > M$, we
can readily calculate how large $N$ needs to be for the $2N + 1$ denominator to make the x
or y component of the matrix element have modulus less than any given $\epsilon$.

(B): The discussion surrounding eq. 6.22 and 6.23 in Section 6.2.3 concentrated on
the limiting behaviour, as $N \to \infty$, of the commutator of averages over $N$ particles of
1-particle spin operators. But as in (A), that discussion of course entails statements about
the situation for finite $N$. Indeed, eq. 6.22 and 6.23 say explicitly that these commutators
are proportional to $1/N$.
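
The $1/N$ scaling can be verified by brute force for small $N$. The Python sketch below is a hedged illustration: it assumes the 1-particle operators are the Pauli matrices $\sigma_x$ and $\sigma_y$ (the text's spin operators may differ by a factor), builds their site averages as explicit $2^N \times 2^N$ matrices, and reports the largest entry (in modulus) of the commutator. All helper names are hypothetical.

```python
# Pauli matrices as nested lists; I2 is the one-site identity.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def site_average(single, n_sites):
    """(1/N) sum_i of 'single' acting on site i and the identity elsewhere."""
    dim = 2 ** n_sites
    total = [[0j] * dim for _ in range(dim)]
    for site in range(n_sites):
        term = [[1]]
        for k in range(n_sites):
            term = kron(term, single if k == site else I2)
        for r in range(dim):
            for c in range(dim):
                total[r][c] += term[r][c] / n_sites
    return total

def commutator_size(n_sites):
    """Largest |entry| of [average sigma_x, average sigma_y] on n_sites spins."""
    ax, ay = site_average(SX, n_sites), site_average(SY, n_sites)
    xy, yx = matmul(ax, ay), matmul(ay, ax)
    dim = len(xy)
    return max(abs(xy[r][c] - yx[r][c]) for r in range(dim) for c in range(dim))

for n in range(1, 5):
    print(n, commutator_size(n), 2 / n)
```

Operators on different sites commute, so the commutator of the averages reduces to $(2i/N)$ times the averaged $\sigma_z$, a diagonal matrix whose largest entry has modulus $2/N$. The two printed columns should therefore agree up to rounding.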

(C): Section 6.3.2 reported that the quantum de Finetti representation theorem, eq.
6.34, failed for finite $N$: a permutation-invariant state on $N$ quantum systems need not
be a mixture of product states. But in recent years, various finite-$N$ analogues have been
proven, with the following flavour: any state of $M$ systems, with $M < N$, that is obtained
from a permutation-invariant state on $N$ systems, by tracing out (partial tracing over)
$N - M$ systems, can be approximated by a mixture of product states on the $M$ systems
(i.e. states $\omega^M$), with an error that goes to zero with the ratio $M/N$, e.g. an error like
$O(M/N)$ (Koenig and Renner 2005, Renner 2007, Koenig and Mitchison 2007).

Note finally that as with my previous two examples, in these three illustrations of
(2:Before) we again see the Straightforward Justification of Section 3.3.3 in action. I
will not labour the point: I will just quote Landsman's own statement of the idea of
that Section's fifth paragraph, as applied to the averaging process used (in Section 6.2.3)
in defining macroscopic quantities. Thus Landsman writes: 'the limit $N \to \infty$ is valid
whenever averaging over $N = 10^{23}$ particles is well approximated by averaging over an
arbitrarily larger number $N$ (which, then, one might as well let go to infinity)' (Landsman
2006, preamble to Section 6; p. 493).

6.3.4 Supervenience is a red herring 

As in two previous examples, I shall be brief about my third claim, (3:Herring): that 
although various supervenience theses are true, they give little or no insight into the 
emergent behaviour, or more generally into "what is going on" in the example. The 
reason is as in the previous examples: there is no connection between supervenience's 
idea of a variety of ways to have a higher-level property P (in particular the example's 
emergent property) and the limit processes on which the example turns. 

But to save space, I will not formulate any such supervenience theses. I leave it as
an exercise (!) to formulate how a system's having an emergent property, such as a
specific value for a specific classical quantity, or a superselection operator, supervenes
on the system's microstate, i.e. on the sequence of states assigned to the algebras $\mathcal{A}_1^N$
for successively larger $N$-particle (or site) sub-systems. Suffice it to say here that the
important points will be that:

(i) the value (indeed, even just the well-definedness) of the quantity or operator, in
the strong sense of (1:Deduce) but not the weak sense of (2:Before), needs one to take the
limit $N \to \infty$, in the classical (Section 6.2.3) and-or quantum (Section 6.2.4) version; cf.
the end of Section 6.3.1;

(ii) the idea of supervenience on the microstate makes no connection with taking this 
limit. 

6.4 Summing up superselection 

To emphasise the parallel between this long example and the previous two, let me sum
up with a list of six morals which are parallel to those in Section 5.4. As in that Section,
it would be flogging a dead horse to again make explicit my four claims, or Section 3.3.3's
Straightforward Justification, or the parallels with previous examples.

(i) : The large finite is often well-modelled by the infinite. 

(ii) : Such models are often justified in a straightforward, even obvious, way, by math- 
ematical convenience and empirical success. 

(iii) : The infinite often brings new mathematical structure: in this example, superse- 
lection, and associated notions like unitarily inequivalent representations. 

(iv) : Nevertheless, there is often a reduction: the emergent superselection properties, 
and the associated behaviour, are reducible to a sufficiently rich theory that takes the 
infinite limit. 

(v) : On the other hand, one can often see emergent behaviour on the way to the limit.
Indeed, the larger your error bar, e.g. for detecting experimental statistics characteristic
of some commuting operators, the lower the number of particles (or lattice sites) $N$ for
which your experiments will confirm (more precisely: you think your experiments
confirm!) the properties and behaviour that are characteristic of superselection.

(vi) : Various supervenience theses hold — but they are trivial, or at least scientifically 
useless. 

7 Phase transitions 

Lack of space means I must deal much more briefly with my fourth example. This is
unfortunate, for two reasons. Scientifically, this example represents a much larger and more
controversial topic than my previous ones. And philosophically, it has been a prominent
example in recent controversy about whether the $N = \infty$ limit is "physically real", and
whether a "singular" limit is necessary for emergence (e.g. Callender (2001, Section 5,
pp. 547-551), Liu (2001, Sections 2-3, pp. S326-S341), Batterman (2005, Section 4, pp.
233-237); and more recently, Mainwood (2006, Chapters 3,4; 2006a), Bangu (2009,
Section 5, pp. 496-502), Menon and Callender (2011, especially Sections 3, 4)). But sufficient
unto the day is the work thereof! I have already declared my general position in these
controversies (especially in Sections 1 and 3). So in this Section, it will be enough to
sketch: (i) how statistical mechanics treats phase transitions by taking a limit, in which
the number of constituent particles (or sites in a lattice) $N$ goes to infinity (Section 7.1);
and (ii) how this treatment illustrates my claims (Section 7.2).

For more information, I especially recommend: (i) accounts by masters of the subject,
such as Emch and Liu (2002, Chapters 11-14) and Kadanoff (2009, 2010, 2010a), which
treat the technicalities and history, as well as the conceptual foundations, of the subject;
and (ii) Mainwood (2006, Chapters 3,4; 2006a), to which I am much indebted, especially
in Section 7.2.1's treatment of phase transitions in finite $N$ systems, and Section 7.2.2's
discussion of cross-over.

7.1 Phase transitions and thermodynamics 
7.1.1 Separating issues and limiting scope 

This example is an aspect of a very large topic, the "emergence" of thermodynamics from 
statistical mechanics — around which debates about the reducibility of one to the other 
continue. This topic is very large, for various reasons. Three obvious ones are: (i) both 
thermodynamics and statistical mechanics are entire sciences; (ii) statistical mechanics, 
and so this topic, can be developed in either classical or quantum terms; (iii) there is no 
single agreed formalism for statistical mechanics (unlike e.g. quantum mechanics). 

Phase transitions are themselves a large topic: there are several classification schemes
for them, and various approaches to understanding them, some of which come in both
classical and quantum versions. I will specialize to just one aspect; which will however
be enough to illustrate my claims. Namely: the fact (on most approaches!) that for
statistical mechanical systems, getting a (theoretical description of a) phase transition
requires that one take a limit (often called 'the thermodynamic limit'), in which the
number of constituent particles (or sites in a lattice) $N$ goes to infinity. In brief, this
means something like: both the number $N$ of constituent particles (or sites), and the
volume $V$ of the system tend to infinity, while the density $\rho = N/V$ remains fixed. More
details in Section 7.1.2.

Even for this one aspect, I will have to restrict myself severely. Three main limitations
are:

1) : As regards philosophy, I impose a self-denying ordinance about the controversies
mentioned in this Section's preamble: apart from what I have already said in Sections 1
and 3, and one remark in Section 7.2.1.

2) : In physics, how to understand phase transitions is an ongoing research area. For 
our purposes, the main limiting (i.e. embarrassing!) fact is that most systems do not have 
a well-defined thermodynamic limit — so that all that follows is of limited scope. 

3) : Various detailed justifications can be given for phase transitions requiring us to
take the thermodynamic limit; and Section 7.1.2 will only sketch a general argument, and
mention two examples. Much needs to be (and has been!) said by way of assessing these
justifications, but I will not enter into this here. But by way of emphasizing how open all
these issues are, I note that some physicists have developed frameworks for understanding
phase transitions without taking the thermodynamic limit (Gross (2001)).



7.1.2 The thermodynamic limit 

I will first give a broad description of the need for the thermodynamic limit; and then
a classical and a quantum example to show how the limit secures new mathematical
structure appropriate for describing phase transitions (Section 7.1.2.A). Then in Section
7.1.2.B, I will give a classical example of approaching the limit. This example will be the
phase transition of a ferromagnet at sub-critical temperatures. It has the merit of being
very simple, and of developing Section 3.3.2's dissolution of the "mystery" of describing
a finite system with a model using infinite $N$.

7.1.2.A: The need for the limit. For classical physics, the brutal summary of why we
need this limit is as follows. Statistical mechanics follows thermodynamics in representing
phase transitions by non-analyticities of the free energy $F$. But a non-analyticity cannot
occur for the free energy of a system with finitely many constituent particles (or
analogously: lattice sites). So statistical mechanics considers a system with infinitely many
particles or sites, $N = \infty$. One gets some control over this idea by subjecting the limiting
process, $N \to \infty$, to physically-motivated conditions like keeping the density constant,
i.e. letting the volume $V$ of the system also go to infinity, while $N/V$ is constant. As
we would expect (especially given my previous examples!), this infinite limit gives new
mathematical structures: which happily turn out to describe phase transitions, in many
cases in remarkable quantitative detail.

But to make my ferromagnet example comprehensible, I need to spell out this line of
argument in a bit more detail. In Gibbsian statistical mechanics, we postulate that the
probability of a state s is proportional to $\exp(-H(s)/kT) = \exp(-\beta H(s))$, where
$\beta := 1/kT$ is the inverse temperature and k is Boltzmann's constant. That is:

$$\mathrm{prob}(s) = \exp(-\beta H(s)) / Z \qquad (7.1)$$

where the normalization factor Z, the partition function, is the sum (or integral) over all
states, and defines the free energy F as:

$$Z = \sum_s \exp(-\beta H(s)) , \qquad F := -kT \ln Z . \qquad (7.2)$$

Thus F is, essentially, the logarithm of the partition function; which is itself the sum
(or integral) over all states of the exponential of $-\beta H$. It turns out that Z
and F encode, in their functional forms, a great deal of information about the system:
various quantities, in particular the system's thermodynamic quantities, can be obtained
from them, especially by taking their derivatives. For example, in a ferromagnet, the
magnetization is the first derivative of the free energy with respect to the applied magnetic
field, and the magnetic susceptibility is the second derivative.

Now, broadly speaking: phase transitions involve abrupt changes, in time and-or space,
in thermodynamic quantities: for example, think of the change of particle density in a
solid-liquid, or liquid-gas, transition. Thermodynamics describes these changes as
discontinuities in thermodynamic quantities (or their derivatives), and statistical mechanics
follows suit. This means that the statistical mechanical description of phase transitions
requires non-analyticities of the free energy F. But under widely applicable assumptions,
the free energy of a system with finitely many constituent particles (or analogously: sites)
is an analytic function of the thermodynamic quantities within it. For example, we will
soon see that in the Ising model with N sites, the Hamiltonian H is a quadratic polynomial
in spin variables (cf. eq. 7.4). This means that the partition function Z, which by
eq. 7.2 is a sum of exponentials of $-\beta H$, is analytic; and so also is its logarithm, and the
free energy. (This and similar arguments about more general forms of the Hamiltonian (or
partition function or free energy), are widespread: e.g. Ruelle (1969, p. 108f.), Thompson
(1972, p. 79), Le Bellac (1991, p. 9), Lavis and Bell (1999, pp. 72-3).)
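
The Ising instance of this argument can be spelled out in a few lines. What follows is only a sketch: the dimensionless couplings $K := -J/kT$ and $h := -J'/kT$ are notation assumed here for convenience (K is not used elsewhere in the text), with J and J' the coupling constants of the Ising Hamiltonian of eq. 7.4.

$$\begin{aligned}
Z_N(K,h) &= \sum_{\sigma \in \{\pm 1\}^N} \exp\Bigl( K \sum_{nn} \sigma_p \sigma_q + h \sum_p \sigma_p \Bigr) , \\
F_N &= -kT \ln Z_N(K,h) .
\end{aligned}$$

For finite N the sum has $2^N$ terms, each an entire function of (K, h); so $Z_N$ is entire. For real K and h every term is strictly positive, so $Z_N > 0$ and $\ln Z_N$ is real-analytic there. Hence $F_N$ has no non-analyticity at any real field and any temperature T > 0. By contrast, a limit of analytic functions, such as $\lim_{N \to \infty} F_N / N$, need not be analytic; and that is where the thermodynamic limit makes room for phase transitions.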

I of course admit that (as my phrases 'broadly speaking' and 'under widely applicable
assumptions' indicate) this argument why phase transitions need the thermodynamic
limit is not a rigorous theorem. Hence the effort mentioned in Section 7.1.1 to develop a
theory of phase transitions in finite systems; and the philosophical debate among Callender
et al. mentioned in this Section's preamble. Hence also the historical struggle to recognize
the need for infinite systems: both Emch and Liu (2002, p. 394) and Kadanoff (2009, p.
782; 2010, Section 4.4) cite the famous incident of Kramers putting the matter to a vote
at a meeting in memory of Van der Waals in 1937.

But this argument, although not a rigorous theorem, is very "robust" — and recognized 
as such by the literature. For example, Kadanoff makes it one of the main themes of his 
recent discussions, and even dubs it the 'extended singularity theorem' (2010, Sections 2.2, 
6.7.1; 2010a, Section 4.1). He also makes it a playful variation on Anderson's slogan that 
'more is different' (as I mentioned in footnote 1). Namely, he summarizes it in Section
titles like 'more is the same; infinitely more is different' (2009, Section 1.5; 2010, Section 
3). In any case, for the rest of this paper, I accept the argument. 

Taking the thermodynamic limit introduces new mathematical structures. But (as 
one might expect from my previous examples) the variety of formalisms in statistical 
mechanics (and indeed, the variety of justifications for taking the limit) means that there 
is a concomitant variety of new structures that in the limit get revealed. I mention one 
classical, and one quantum, example. 

In Yang-Lee theory (initiated by Yang and Lee 1952), one uses complex generalizations
of the partition function and free energy, and then argues that for any $z \in \mathbb{C}$, there can
be a phase transition (i.e. a non-analyticity of F or Z) at z only if there are zeroes of
Z arbitrarily close to z. For finite N, Z has finitely many zeroes, so that there can be a
phase transition only at the zeroes themselves: but all of them lie off the real line, and
so are not physical. Taking the limit $N \to \infty$ "breaks" this last argument: there can be
a curve of zeroes that intersects the real axis. Indeed, in Yang-Lee theory one goes on
to classify phase transitions in terms of the behaviour of the density of zeroes in $\mathbb{C}$: (cf.
Thompson 1972, pp. 85-88; Ruelle 1969, pp. 108-112; Lavis and Bell 1999a, pp. 114,
125-134).

Footnote: Of course, since I have not precisely defined 'thermodynamic limit' (let alone
'phase transition'!), the argument could hardly be a rigorous theorem. Ruelle (1969,
Sections 2.3-4, 3.3-5) rigorously discusses conditions for the thermodynamic limit; (cf. also
Emch (1972, p. 299; 2006, p. 1159), Lavis and Bell (1999a, pp. 116, 260)). Such discussions
bring out how in some models, the limit is not just the idea that, keeping the density
constant, the number N of molecules or sites tends to infinity: there are also conditions on
the limiting behaviour of short-range forces. This means that the models eventually run
up against both aspects of my fourth claim, (4:Unreal), i.e. against what Section 2 called
'atomism' and 'cosmology'.

My quantum example concerns Gibbs states and KMS states. This follows on from
Section 6's discussion of superselection: especially Section 6.2.1 on unitarily inequivalent
representations of an algebra of quantities, and these representations differing in the value
of a global/macroscopic quantity. (For more details, cf. Emch (1972, pp. 213-223; 2006,
Section 5.6-7, pp. 1144-1154); Sewell (1986, pp. 73-80; 2002, pp. 113-123); Emch and
Liu (2002, pp. 346-357); Liu and Emch (2005, pp. 142-145, 157-161).)

Thus we recall that the Gibbs state of a finite quantum system with Hamiltonian H
at inverse temperature $\beta = 1/kT$ is given by the density matrix

$$\rho = \exp(-\beta H) / \mathrm{tr}(\exp(-\beta H)) \qquad (7.3)$$

and represents the (Gibbsian) equilibrium state of the system. (Note the beautiful analogy
with eq.s 7.1 and 7.2.) It is unique (for given $\beta$): thereby precluding the representation
of two phases of the system at a common temperature, as one would want for a phase
transition!

Footnote: This uniqueness also precludes spontaneous symmetry breaking, understood (as
usual) as the allowance of distinct equilibria that differ by a dynamical symmetry: e.g. in
Section 6.2.1's scenario, ground states each with all spins aligned, but in different spatial
directions. Spontaneous symmetry breaking is (yet another!) important aspect of phase
transitions which I cannot here pursue: a fine recent philosophical discussion is Liu and
Emch (2005).

So how can we give a quantum description of phase transitions? The algebraic approach
to quantum statistical mechanics proposes some states, viz. KMS states, which
are defined on infinite quantum systems and which generalize the notion of a Gibbs state
in a way that is (a) compelling mathematically, and (b) well-suited to describing phase
transitions. A word about each of (a) and (b):

(a): A mathematical property of Gibbs states (the 'KMS condition') is made into a
definition of an equilibrium state that is applicable to both infinite and finite systems:
(for the latter it coincides with the Gibbs state at the given temperature). KMS states
can be shown to have various stability or robustness properties that make them very well
suited to describe (stable) physical equilibria. (Emch (2006, Section 5.4, pp. 1128-1142)
is an excellent survey of these properties. Such a survey brings out how KMS states could
themselves form an example of emergent behaviour, in my sense of novel and robust
properties!)

(b): The set $K_\beta$ of KMS states at a given inverse temperature $\beta$ is in general not a
singleton set. Rather, it is convex, with: (i) every element having a unique expression as
a mixture of its extremal points; and (ii) its extremal points being well-suited to describe
pure thermodynamical phases (mathematically, they are factor states). Taken together,
(i) and (ii) suggest that a compelling representative of the state of a system undergoing
a phase transition at inverse temperature $\beta$ is a non-extremal $\omega \in K_\beta$.

So much by way of indicating justifications for taking "the" thermodynamic limit. I
turn to discussing the approach to the limit.
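
Before turning to that, the Yang-Lee picture described earlier in this Section can be made concrete numerically. The following is a minimal sketch, not a reconstruction of any calculation in the works cited. It assumes an Ising model on a small square lattice with open boundaries, a dimensionless ferromagnetic coupling `K` (an illustrative name and value), and the complex variable $z = e^{2h}$ with h the dimensionless field; the function names are hypothetical. For finite N the partition function is then, up to a factor $e^{-Nh}$, a polynomial of degree N in z with positive coefficients, and by the Lee-Yang circle theorem its zeroes lie on the unit circle; so none of them sits on the positive real axis, which is where physical (real) fields live.

```python
import numpy as np


def all_configurations(n_sites):
    """Every spin configuration as a row of +1/-1 entries (2**n_sites rows)."""
    bits = (np.arange(2 ** n_sites)[:, None] >> np.arange(n_sites)) & 1
    return 2 * bits - 1


def nearest_neighbour_bonds(side):
    """Nearest-neighbour pairs on a side x side square lattice, open boundaries."""
    bonds = []
    for r in range(side):
        for c in range(side):
            if c + 1 < side:
                bonds.append((r * side + c, r * side + c + 1))
            if r + 1 < side:
                bonds.append((r * side + c, (r + 1) * side + c))
    return bonds


def partition_polynomial(side, K):
    """Coefficients c_n of Z (up to exp(-N h)) as a polynomial in z = exp(2 h).

    c_n sums exp(K * sum_nn s_p s_q) over the configurations with n up-spins.
    """
    n_sites = side * side
    spins = all_configurations(n_sites)
    pair_sum = np.zeros(len(spins))
    for p, q in nearest_neighbour_bonds(side):
        pair_sum += spins[:, p] * spins[:, q]
    n_up = (spins == 1).sum(axis=1)
    return np.bincount(n_up, weights=np.exp(K * pair_sum), minlength=n_sites + 1)


def yang_lee_zeros(side, K):
    """Complex zeroes of the finite-N partition function in the variable z."""
    coefficients = partition_polynomial(side, K)
    return np.roots(coefficients[::-1])  # np.roots wants the highest power first


if __name__ == "__main__":
    K = 1.0  # assumed value, chosen only for illustration
    for side in (2, 3):
        zeros = yang_lee_zeros(side, K)
        nearest = np.min(np.abs(np.angle(zeros)))
        print(f"N = {side * side}: {len(zeros)} zeroes, "
              f"smallest angular distance from the positive real axis = {nearest:.3f}")
```

Running the script for a few lattice sizes lets one inspect how the zeroes are placed relative to the positive real axis at each finite N; only in the limit can they accumulate on it.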

7.1.2.B: Approaching the limit: I will give a classical example of approaching the
limit $N \to \infty$ of infinitely many sites in a lattice: namely, the phase transition (change
of magnetization) of a ferromagnet at sub-critical temperatures, as described by the Ising
model with N sites (in two or more spatial dimensions). As I mentioned at the start of
this Section, this will develop Section 3.3.2's dissolving of the "mystery" of describing a
finite-N system with a model using infinite N.

The Ising model postulates that at each of N sites, a classical "spin" variable $\sigma$ (which
we think of as defined with respect to some spatial direction) takes the values ±1. To
do Gibbsian statistical mechanics, i.e. to apply eq.s 7.1 and 7.2, we need to define a
Hamiltonian and then sum over configurations. The Hamiltonian is chosen to give a
simple representation of the ideas that (i) neighbouring spins interact and tend to be
aligned (i.e. their having equal values has lower energy) and (ii) the spins are coupled
to an external magnetic field which points along the given spatial direction. Thus the
Hamiltonian is

$$H = J \sum_{nn} \sigma_p \sigma_q + J' \sum_p \sigma_p \qquad (7.4)$$

where: the first sum is over all pairs of nearest-neighbour sites, the second sum is over all
sites, J (with the dimension energy) is negative to represent that the neighbouring spins
"like" to be aligned, and J' is given by the magnetic moment times the external magnetic
field.

The simplest possible case is the case of N = 1! With only one site, the Hamiltonian
becomes

$$H = J' \sigma \qquad (7.5)$$

so that if we define a dimensionless coupling $h := -J'/kT$, then eq.s 7.1 and 7.2 give

$$\mathrm{prob}(+1) = e^{h}/z \quad \text{and} \quad \mathrm{prob}(-1) = e^{-h}/z , \quad \text{with} \quad z = e^{h} + e^{-h} = 2 \cosh h . \qquad (7.6)$$

This implies that the magnetization, i.e. the average value of the spin, is

$$\langle \sigma \rangle = e^{h}/z - e^{-h}/z = \tanh h . \qquad (7.7)$$

This is as we would hope: the statistical mechanical treatment of a single spin predicts 
the magnetization increases smoothly from -1, through zero, to +1 as the applied field 
along the given axis increases from minus infinity through zero to plus infinity. 
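
As a short check, offered only as a sketch: the same single-spin result follows from the general recipe that the magnetization is the first derivative of the free energy with respect to the field, and the susceptibility the second. Here the field enters through J', and the symbol $\chi$, for the susceptibility with respect to the dimensionless h, is introduced just for this illustration.

$$\begin{aligned}
F &= -kT \ln z = -kT \ln\bigl( 2 \cosh (J'/kT) \bigr) , \\
\frac{\partial F}{\partial J'} &= -\tanh (J'/kT) = \tanh h = \langle \sigma \rangle , \\
\chi := \frac{\partial \langle \sigma \rangle}{\partial h} &= \frac{1}{\cosh^2 h} \le 1 .
\end{aligned}$$

So for N = 1 the response is smooth and bounded at every value of the field: the susceptibility never exceeds 1, the value it takes at h = 0.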

What about larger N? The analytical problem becomes much more complicated
(though the magnetization is still a smooth function of the applied field). But the effect
is what we would expect: a larger N acts as a brake on the ferromagnet's response
to the applied field increasing from negative to positive values (along the given axis).
That is: the increased number of nearest neighbours means that the ferromagnet "lingers
longer", has "more inertia", before the rising value of the applied field succeeds in flipping
the magnetization from -1 to +1. More precisely: as N increases, most of the change in
the magnetization occurs more and more steeply, i.e. occurs in a smaller and smaller
interval around the applied field being zero. Thus the magnetic susceptibility, defined as
the derivative of magnetization with respect to magnetic field, is, in the neighbourhood of
0, larger for larger N, and tends to infinity as $N \to \infty$. As Kadanoff says: 'at a very large
number of sites ... the transition will become so steep that the casual observer might say
that it has occurred suddenly. The astute observer will look more closely, see that there
is a very steep rise, and perhaps conclude that the discontinuous jump occurs only in the
infinite system' (2009, p. 783; and Figure 4; cf. also 2010, p. 20, Figure 5).
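
The steepening just described can be seen directly in very small lattices by brute-force enumeration. The sketch below is an illustration under stated assumptions, not Kadanoff's calculation: it takes square lattices with open boundaries, a dimensionless coupling `K = -J/kT` set to an assumed value of 1.0 (which corresponds to a sub-critical temperature for the infinite two-dimensional model), and the dimensionless field `h = -J'/kT` of eq. 7.6. The function names are hypothetical. The susceptibility is computed from the fluctuation formula $\partial \langle \sigma \rangle / \partial h = (\langle M^2 \rangle - \langle M \rangle^2)/N$, where M is the total spin.

```python
import numpy as np


def all_configurations(n_sites):
    """Every spin configuration as a row of +1/-1 entries (2**n_sites rows)."""
    bits = (np.arange(2 ** n_sites)[:, None] >> np.arange(n_sites)) & 1
    return 2 * bits - 1


def nearest_neighbour_bonds(side):
    """Nearest-neighbour pairs on a side x side square lattice, open boundaries."""
    bonds = []
    for r in range(side):
        for c in range(side):
            if c + 1 < side:
                bonds.append((r * side + c, r * side + c + 1))
            if r + 1 < side:
                bonds.append((r * side + c, (r + 1) * side + c))
    return bonds


def gibbs_distribution(side, K, h):
    """Total spin and exact Gibbs probability of every configuration.

    Weights are exp(K * sum_nn s_p s_q + h * sum_p s_p), i.e. exp(-H/kT).
    """
    spins = all_configurations(side * side)
    pair_sum = np.zeros(len(spins))
    for p, q in nearest_neighbour_bonds(side):
        pair_sum += spins[:, p] * spins[:, q]
    total_spin = spins.sum(axis=1)
    log_weight = K * pair_sum + h * total_spin
    weight = np.exp(log_weight - log_weight.max())  # shift guards against overflow
    return total_spin, weight / weight.sum()


def magnetization(side, K, h):
    """Average spin per site."""
    total_spin, prob = gibbs_distribution(side, K, h)
    return float(np.dot(prob, total_spin)) / (side * side)


def susceptibility(side, K, h):
    """Derivative of the magnetization per site with respect to h."""
    total_spin, prob = gibbs_distribution(side, K, h)
    mean = np.dot(prob, total_spin)
    mean_sq = np.dot(prob, total_spin ** 2)
    return float(mean_sq - mean ** 2) / (side * side)


if __name__ == "__main__":
    K = 1.0  # assumed coupling, chosen only for illustration
    for side in (1, 2, 3, 4):
        chi = susceptibility(side, K, 0.0)
        m = magnetization(side, K, 0.05)
        print(f"N = {side * side:2d}: susceptibility at h = 0 is {chi:.3f}, "
              f"magnetization at h = 0.05 is {m:.3f}")
```

For N = 1 the code reproduces eq. 7.7, and the zero-field susceptibility grows as the lattice is enlarged from 1 to 4, 9 and 16 sites, while remaining finite at each N. Exhaustive enumeration is only feasible for a handful of sites; it is used here because it involves no approximation.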

Clearly, this example corresponds closely to that in Section 3.3.2's dissolving of the
"mystery". And this general picture of the approach to the $N \to \infty$ limit applies much
more widely. In particular, very similar remarks apply to the liquid-gas phase transition, i.e.
boiling. There the quantity which becomes infinite in the $N \to \infty$ limit, i.e. the analogue
of the magnetic susceptibility, is the compressibility, defined as the derivative of the density
with respect to the pressure. And I am happy to give a hostage to fortune: as I declared
in a footnote to Section 3.3.2, I believe this example is a good prototype for dissolving the
corresponding alleged mystery in physics' other 'singular' limits. Though I cannot argue
for that here, I note that the views of some philosophers discussing phase transitions seem
to mesh with it: e.g. Mainwood's proposal in Section 7.2.1 below; Bangu's appeal (2009,
p. 497f.) to Bogen and Woodward's (1988) distinction between data and phenomena;
and Menon and Callender (2011, Sections 3, 4).

7.2 The claims illustrated by emergent phase transitions 

I will not now devote a Subsection to each of my three main claims, (1:Deduce), (2:Before)
and (3:Herring), as I did for my first three examples. It would take too much space, and
much more detail than Section 7.1 has supplied, to do so. Thus (1:Deduce) would require
me to properly define: (a) a handful of novel and robust behaviours shown in phase
transitions (a handful of $T_t$s), and (b) a corresponding handful of statistical mechanical theories
$T_b$ in which the behaviours are rigorously deducible if one takes an appropriate version
of "the" thermodynamic limit, $N \to \infty$, but not otherwise. Similarly, for (2:Before) and
(3:Herring). Doing all that properly would require a Section as long as Section 6 ... and
as to (3:Herring), I would anyway prefer not to flog horses that by now should be dead!

Instead, I will just summarize how phase transitions illustrate the three claims, and
endorse a proposal of Mainwood's about emergence before the limit: i.e. about how
to think of phase transitions in finite-N systems (Section 7.2.1). Then I will briefly
report a remarkable class of phenomena associated with phase transitions, viz. cross-over
phenomena (Section 7.2.2). These phenomena make emergence before the limit
even more vivid than it was in my previous examples; for they show how an emergent
phenomenon can be first gained, then lost, as we approach a phase transition. And this
will illustrate (4:Unreal) as well as (2:Before). (For the second topic, as for the first, I
learnt what follows from Paul Mainwood (2006, especially Chapters 3 and 4). So all this
Subsection owes a great deal to him.)

7.2.1 Emergence in the limit, and before it: Mainwood's proposal



My main claims are my two reconciling claims, (1:Deduce) and (2:Before). Applied to
phase transitions, they would say, roughly speaking:

(1:Deduce): Some of the emergent behaviours shown in phase transitions are, when
understood (as in thermodynamics) in terms of non-analyticities, rigorously deducible
within a statistical mechanical theory that takes an appropriate version of the $N \to \infty$
limit. But they are not deducible in a theory that sticks to finite N; so that if one
concentrates on finite N, one will claim irreducibility.

(2:Before): But these behaviours can also be understood more weakly; (no doubt, this
is in part a matter of understanding them phenomenologically). And thus understood,
they occur before the limit, i.e. in finite-N systems.

Here I admit that the phrases 'some of the emergent', 'appropriate version' and 'can
be understood more weakly' are vague in ways which, as I said in the preamble, I have
not the space to make precise: hence my saying 'roughly speaking'. But I still submit
that the claims are true, for a wide class of emergent behaviours; and that Section 7.1.2
gives good evidence for this. More precisely, Section 7.1.2.A supports (1:Deduce), and
Section 7.1.2.B supports (2:Before). By way of summing up: one can check that the six
previous morals, (i) to (vi), that were used in Sections 5.4 and 6 to sum up the fractals
and superselection examples, apply again.

Finally, I would like to briefly report and endorse a proposal of Mainwood's (2006,
Section 4.4.1, p. 238; 2006a, Section 4.1) which fits well with the swings-and-roundabouts
flavour of my combining (1:Deduce), especially its negative claim of non-deducibility at
finite N, with (2:Before). Mainwood's topic is, not emergence in general, but the recent
philosophical debate about phase transitions in finite systems, especially as focussed by
Callender's (2001, p. 549) presentation of four jointly contradictory propositions about
phase transitions. Mainwood first gives a very judicious survey of the pros and cons
of denying each of the four propositions, and then uses its conclusions to argue for a
proposal that evidently reconciles: (a) statistical mechanics' use of the thermodynamic
limit to describe phase transitions in terms of non-analyticities; and (b) our saying that
phase transitions occur in the finite system. That is, to take a stock example: Mainwood's
proposal secures that a kettle of water, though a finite system, can boil!

Mainwood's proposal is attractively simple. It is that for a system with N degrees of
freedom, with a free energy $F_N$ that has a well-defined thermodynamic limit, $F_N \to F_\infty$,
we should just say:

phase transitions occur in the finite system iff $F_\infty$ has non-analyticities.

And if we wish, we can add requirements that avoid our having to say that small systems
(e.g. a lattice of four Ising spins laid out in a square) undergo phase transitions. Namely:
we can add to the above right-hand side conditions along the following lines: and if N is
large enough, or the gradient of $F_N$ is steep enough etc. (Of course, 'large enough' etc.
are vague words. But Mainwood thinks that the consequent vagueness about whether a
phase transition occurs is acceptable; and I agree: after all, 'boil' etc. are vague.)

Footnote: Besides, a theory of phase transitions in finite systems of the kind argued for
by Gross (mentioned in Section 7.1.1) would surely illustrate (2:Before), rather than refute
(1:Deduce)'s negative claim of non-deducibility at finite N. For of course, a theory like
Gross' cannot overturn pure mathematical arguments, for example about the analyticity
of certain forms of partition function.

7.2.2 Cross-over: gaining and losing emergence at finite N

I end by describing cross-over phenomena. I again follow Mainwood, who uses it (2006, 
Sections 4.4.2-3, pp. 242-247; 2006a, Section 4.2) to illustrate and defend his proposal 
for phase transitions in finite systems. I concur with that use of it. But my own aims 
are rather different. The main idea (mentioned at the start of Section [2] and the end of 
Section [5^ will be that cross-over phenomena yield striking illustrations of "oscillations" 
between (2:Before) and (4:Unreal). That is: a system can be: 

(i) first, manipulated so as to illustrate (2:Before), i.e. an emergent behaviour at finite
N; and

(ii) then manipulated so as to lose this behaviour, i.e. to illustrate (4:Unreal); by the
manipulation corresponding to higher, and unrealistic, values of N; and

(iii) then manipulated so as to either (a) enter a regime illustrating some other emergent
behaviour, or (b) revert to the first emergent behaviour; so that either (a) or (b)
illustrates (2:Before) again.

In short: cross-over will illustrate my swings-and-roundabouts combination of (2:Before) 
and (4:Unreal).

Besides, cross-over will illustrate a simpler point about (2:Before), which we already
saw for my first two examples: viz., how the emergent behaviour that one "sees" at large
but finite N can be "lost" if one alters certain features of the situation. We saw this:

(a) for the method of arbitrary functions: by raising one's standard of how close to 
exact equiprobability was "close enough", or by widening the class of density functions 
with respect to each of which one required approximate equiprobability; (Sections 14.2. ![ 
KT72\) and 

(b) for fractals: by "getting better eyesight", i.e. by reducing the length-scale on which
one could resolve spatial structure, and so see that at the given finite N, there was not
yet the infinitely descending tower of structure characteristic of a fractal; (Section 5.2.2).

Cross-over occurs near a critical phase transition. This is one where a quantity called
the correlation length $\xi$, which summarizes the average length-scale on which microscopic
quantities' values are correlated, diverges (in the modest sense of growing without bound, 
reviewed in Section ISTTj) . Understanding many such transitions (quantitatively as well as 
qualitatively), and understanding cross-over in particular, is one of the great successes of 
the renormalization group (RG) techniques that have been developed over the last fifty 
years. 

Beware: some recent philosophical literature, waxing enthusiastic about the RG, suggests
that every phase transition has a "singular" thermodynamic limit and-or infinite
correlation length. Not so: witness the fact that until now, this Section has not had to
mention the RG! More generally, I reiterate my point (e.g. in the preamble to Section 3
and in Section 6.2.2) that there can be emergent limiting behaviour, with nothing "singular"
about the limit. But enough admonition! I turn to describing cross-over; (for details,
cf. e.g. Yeomans (1992, p. 112), Cardy (1997, pp. 61, 69-72), Chaikin and Lubensky
(2000, pp. 216, 270-3); Hadzibabic et al. (2006) is a recent example of experiments).

As substances approach a critical phase transition, they typically show behaviour char- 
acteristic of one of a small number of universality classes. Cross-over happens when a 
substance appears to show behaviour characteristic of one universality class, but then 
suddenly changes to another as it is brought even closer to its critical point. To explain 
this, we first recall the basic idea of the RG, as follows. 

(1): We define a space X coordinatized by the parameter values that define the microscopic
Hamiltonian, e.g. interaction strengths between particles, and the strength of
the coupling to an external field; (and typically, also temperature).

(2): We define a transformation T on X designed to preserve the large-scale physics
of the system. Typically, T is a coarse-graining, defined by local collective variables that
take some sort of majority vote about the local quantities' values, followed by a rescaling,
so that the resulting system can be assigned to a point in X.

(3): This assignment of the resulting system to a point within X enables one to consider
iterating T, so that we get a flow on X. Critical points where $\xi$ diverges will be
among the fixed points of this flow. For the fact that $\xi$ diverges means that the system
"looks the same" at all length-scales, so that T fixes (makes no change in) the description
of the system. (Besides, this scale-invariance can involve power-law behaviour on all scales
and self-similarity, and so lead to the use of fractals, e.g. to describe the distribution of
sizes of the "islands" of aligned spins in a two- or three-dimensional Ising model; cf. the
end of Section 5.3.)
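
The coarse-graining in step (2) can be illustrated by one common textbook choice, a majority-rule block-spin step on a square lattice of Ising spins. This is a sketch under assumptions: the block size of 3 and the function names are illustrative, and the code acts on a single spin configuration. The transformation T of the text acts instead on the parameter space X; it is the map on Hamiltonians that such a configuration-level step induces, and computing that induced map is the hard part, which is not attempted here.

```python
import numpy as np


def majority_rule_block_spin(spins, block=3):
    """One coarse-graining step on a square array of +1/-1 spins.

    Each block x block patch is replaced by the sign of its total spin (a
    majority vote). The output has side/block sites per edge: reading it as a
    lattice with unit spacing again is the rescaling step.
    """
    side = spins.shape[0]
    if spins.shape != (side, side) or side % block != 0 or block % 2 == 0:
        raise ValueError("need a square lattice whose side is a multiple of an odd block size")
    patches = spins.reshape(side // block, block, side // block, block)
    patch_sums = patches.sum(axis=(1, 3))  # odd patch size, so never zero
    return np.where(patch_sums > 0, 1, -1)


def iterate(spins, steps, block=3):
    """Apply the coarse-graining step repeatedly, as in step (3)."""
    for _ in range(steps):
        spins = majority_rule_block_spin(spins, block)
    return spins


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lattice = rng.choice([-1, 1], size=(27, 27))
    for steps in range(4):
        coarse = iterate(lattice, steps)
        print(f"after {steps} steps: lattice of side {coarse.shape[0]}")
```

An odd block size is used so that the vote can never tie. A configuration that "looks the same" after each step, statistically speaking, is the configuration-level counterpart of a fixed point of the flow.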

I can now describe cross-over. I choose a kind called finite-size cross-over. This occurs
when the ratio of $\xi$ to the system's size determines the fixed point towards which the
RG flow sends the system. When $\xi$ is small compared to the size of the system, though
very large on a microscopic length-scale, the system flows towards a certain fixed point
representing a phase transition; and so exemplifies a certain universality class. Or to put it
more prosaically: coarser and coarser (and suitably rescaled) descriptions of the system are
more and more like descriptions of a phase transition. So in the jargon of my claims: the
system illustrates (2:Before). But as $\xi$ grows even larger, and becomes comparable with
the system size, the flow crosses over and moves away: in general, eventually, towards a
different fixed point. In my jargon: the system runs up against (4:Unreal), and goes over
to another universality class, eventually to another behaviour illustrating (2:Before).

Of course, the correlation length will only approach a system's physical size when the
system has been brought very close indeed to the phase transition, well within the usual
experimental error. That is: until we enter the cross-over regime, experimental data about
quantities such as the gradient of the free energy will strongly suggest non-analyticities,
such as a sharp corner or an infinite peak. Or in other words: until we enter this regime,
the behaviour will be as if the system is infinite in extent. But once we enter this regime,
and the cross-over occurs, the appearance of non-analyticities goes away: peaks become
tall and narrow, but finitely high. Again, in my jargon, we have: (2:Before) followed by
(4:Unreal).

Footnote (on the coarse-graining transformation T in (2) above): Hence another playful
variation by Kadanoff on Anderson's slogan (cf. footnote 1); Section 6.4 of Kadanoff (2010)
is called 'Less is the same'.

Finally, I remark that a similar discussion would apply to other kinds of cross-over,
such as dimensional cross-over. For example, this occurs when the behaviour of a thin
film crosses over from a universality class typical of three-dimensional systems to one for
two-dimensional systems, as $\xi$ becomes comparable with the film's thickness.

7.3 Envoi 

I believe that my claims, in particular my two main ones, (1:Deduce) and (2:Before), are
illustrated by many examples beyond the four I have chosen. For instance, to stick to
the area of my main physics example, viz. the $N \to \infty$ limit of quantum theory: there
are Sewell's own examples of his scheme in Section 6.1.1, and KMS states' description
of thermodynamic phases (Section 7.1.2.A). Showing my claims in many such examples
would indeed be strong testimony to the reconciliation of emergence and reduction. Work
for another day!